kernel/linux.git - Linux kernel stable tree (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
2022-10-30	sched/psi: Use task->psi_flags to clear in CPU migration	Chengming Zhou	3	-22/+5
	The commit d583d360a620 ("psi: Fix psi state corruption when schedule() races with cgroup move") fixed a race problem by making cgroup_move_task() use task->psi_flags instead of looking at the scheduler state. We can extend task->psi_flags usage to CPU migration, which should be a minor optimization for performance and code simplicity. Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Link: https://lore.kernel.org/r/20220926081931.45420-1-zhouchengming@bytedance.com
2022-10-30	sched/psi: Stop relying on timer_pending() for poll_work rescheduling	Suren Baghdasaryan	2	-10/+53
	Psi polling mechanism is trying to minimize the number of wakeups to run psi_poll_work and is currently relying on timer_pending() to detect when this work is already scheduled. This provides a window of opportunity for psi_group_change to schedule an immediate psi_poll_work after poll_timer_fn got called but before psi_poll_work could reschedule itself. Below is the depiction of this entire window: poll_timer_fn wake_up_interruptible(&group->poll_wait); psi_poll_worker wait_event_interruptible(group->poll_wait, ...) psi_poll_work psi_schedule_poll_work if (timer_pending(&group->poll_timer)) return; ... mod_timer(&group->poll_timer, jiffies + delay); Prior to 461daba06bdc we used to rely on poll_scheduled atomic which was reset and set back inside psi_poll_work and therefore this race window was much smaller. The larger window causes increased number of wakeups and our partners report visible power regression of ~10mA after applying 461daba06bdc. Bring back the poll_scheduled atomic and make this race window even narrower by resetting poll_scheduled only when we reach polling expiration time. This does not completely eliminate the possibility of extra wakeups caused by a race with psi_group_change however it will limit it to the worst case scenario of one extra wakeup per every tracking window (0.5s in the worst case). This patch also ensures correct ordering between clearing poll_scheduled flag and obtaining changed_states using memory barrier. Correct ordering between updating changed_states and setting poll_scheduled is ensured by atomic_xchg operation. By tracing the number of immediate rescheduling attempts performed by psi_group_change and the number of these attempts being blocked due to psi monitor being already active, we can assess the effects of this change: Before the patch: Run#1 Run#2 Run#3 Immediate reschedules attempted: 684365 1385156 1261240 Immediate reschedules blocked: 682846 1381654 1258682 Immediate reschedules (delta): 1519 3502 2558 Immediate reschedules (% of attempted): 0.22% 0.25% 0.20% After the patch: Run#1 Run#2 Run#3 Immediate reschedules attempted: 882244 770298 426218 Immediate reschedules blocked: 881996 769796 426074 Immediate reschedules (delta): 248 502 144 Immediate reschedules (% of attempted): 0.03% 0.07% 0.03% The number of non-blocked immediate reschedules dropped from 0.22-0.25% to 0.03-0.07%. The drop is attributed to the decrease in the race window size and the fact that we allow this race only when psi monitors reach polling window expiration time. Fixes: 461daba06bdc ("psi: eliminate kthread_worker from psi trigger scheduling mechanism") Reported-by: Kathleen Chang <yt.chang@mediatek.com> Reported-by: Wenju Xu <wenju.xu@mediatek.com> Reported-by: Jonathan Chen <jonathan.jmchen@mediatek.com> Signed-off-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Chengming Zhou <zhouchengming@bytedance.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Tested-by: SH Chen <show-hong.chen@mediatek.com> Link: https://lore.kernel.org/r/20221028194541.813985-1-surenb@google.com
2022-10-30	sched/psi: Fix avgs_work re-arm in psi_avgs_work()	Chengming Zhou	2	-3/+30
	Pavan reported a problem that PSI avgs_work idle shutoff is not working at all. Because PSI_NONIDLE condition would be observed in psi_avgs_work()->collect_percpu_times()->get_recent_times() even if only the kworker running avgs_work on the CPU. Although commit 1b69ac6b40eb ("psi: fix aggregation idle shut-off") avoided the ping-pong wake problem when the worker sleep, psi_avgs_work() still will always re-arm the avgs_work, so shutoff is not working. This patch changes to use PSI_STATE_RESCHEDULE to flag whether to re-arm avgs_work in get_recent_times(). For the current CPU, we re-arm avgs_work only when (NR_RUNNING > 1 \|\| NR_IOWAIT > 0 \|\| NR_MEMSTALL > 0), for other CPUs we can just check PSI_NONIDLE delta. The new flag is only used in psi_avgs_work(), so we check in get_recent_times() that current_work() is avgs_work. One potential problem is that the brief period of non-idle time incurred between the aggregation run and the kworker's dequeue will be stranded in the per-cpu buckets until avgs_work run next time. The buckets can hold 4s worth of time, and future activity will wake the avgs_work with a 2s delay, giving us 2s worth of data we can leave behind when shut off the avgs_work. If the kworker run other works after avgs_work shut off and doesn't have any scheduler activities for 2s, this maybe a problem. Reported-by: Pavan Kondeti <quic_pkondeti@quicinc.com> Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Suren Baghdasaryan <surenb@google.com> Tested-by: Chengming Zhou <zhouchengming@bytedance.com> Link: https://lore.kernel.org/r/20221014110551.22695-1-zhouchengming@bytedance.com
2022-10-30	sched/psi: Fix possible missing or delayed pending event	Hao Lee	1	-3/+5
	When a pending event exists and growth is less than the threshold, the current logic is to skip this trigger without generating event. However, from e6df4ead85d9 ("psi: fix possible trigger missing in the window"), our purpose is to generate event as long as pending event exists and the rate meets the limit, no matter what growth is. This patch handles this case properly. Fixes: e6df4ead85d9 ("psi: fix possible trigger missing in the window") Signed-off-by: Hao Lee <haolee.swjtu@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Suren Baghdasaryan <surenb@google.com> Link: https://lore.kernel.org/r/20220919072356.GA29069@haolee.io
2022-10-27	sched: Always clear user_cpus_ptr in do_set_cpus_allowed()	Waiman Long	1	-1/+7
	The do_set_cpus_allowed() function is used by either kthread_bind() or select_fallback_rq(). In both cases the user affinity (if any) should be destroyed too. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20220922180041.1768141-6-longman@redhat.com
2022-10-27	sched: Enforce user requested affinity	Waiman Long	2	-0/+13
	It was found that the user requested affinity via sched_setaffinity() can be easily overwritten by other kernel subsystems without an easy way to reset it back to what the user requested. For example, any change to the current cpuset hierarchy may reset the cpumask of the tasks in the affected cpusets to the default cpuset value even if those tasks have pre-existing user requested affinity. That is especially easy to trigger under a cgroup v2 environment where writing "+cpuset" to the root cgroup's cgroup.subtree_control file will reset the cpus affinity of all the processes in the system. That is problematic in a nohz_full environment where the tasks running in the nohz_full CPUs usually have their cpus affinity explicitly set and will behave incorrectly if cpus affinity changes. Fix this problem by looking at user_cpus_ptr in __set_cpus_allowed_ptr() and use it to restrcit the given cpumask unless there is no overlap. In that case, it will fallback to the given one. The SCA_USER flag is reused to indicate intent to set user_cpus_ptr and so user_cpus_ptr masking should be skipped. In addition, masking should also be skipped if any of the SCA_MIGRATE_* flag is set. All callers of set_cpus_allowed_ptr() will be affected by this change. A scratch cpumask is added to percpu runqueues structure for doing additional masking when user_cpus_ptr is set. Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20220922180041.1768141-4-longman@redhat.com
2022-10-27	sched: Always preserve the user requested cpumask	Waiman Long	2	-55/+72
	Unconditionally preserve the user requested cpumask on sched_setaffinity() calls. This allows using it outside of the fairly narrow restrict_cpus_allowed_ptr() use-case and fix some cpuset issues that currently suffer destruction of cpumasks. Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20220922180041.1768141-3-longman@redhat.com
2022-10-27	sched: Introduce affinity_context	Waiman Long	3	-47/+85
	In order to prepare for passing through additional data through the affinity call-chains, convert the mask and flags argument into a structure. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20220922180041.1768141-5-longman@redhat.com
2022-10-27	sched: Add __releases annotations to affine_move_task()	Waiman Long	1	-1/+3
	affine_move_task() assumes task_rq_lock() has been called and it does an implicit task_rq_unlock() before returning. Add the appropriate __releases annotations to make this clear. A typo error in comment is also fixed. Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20220922180041.1768141-2-longman@redhat.com
2022-10-27	sched/fair: Check if prev_cpu has highest spare cap in feec()	Pierre Gondois	1	-6/+7
	When evaluating the CPU candidates in the perf domain (pd) containing the previously used CPU (prev_cpu), find_energy_efficient_cpu() evaluates the energy of the pd: - without the task (base_energy) - with the task placed on prev_cpu (if the task fits) - with the task placed on the CPU with the highest spare capacity, prev_cpu being excluded from this set If prev_cpu is already the CPU with the highest spare capacity, max_spare_cap_cpu will be the CPU with the second highest spare capacity. On an Arm64 Juno-r2, with a workload of 10 tasks at a 10% duty cycle, when prev_cpu and max_spare_cap_cpu are both valid candidates, prev_spare_cap > max_spare_cap at ~82%. Thus the energy of the pd when placing the task on max_spare_cap_cpu is computed with no possible positive outcome 82% most of the time. Do not consider max_spare_cap_cpu as a valid candidate if prev_spare_cap > max_spare_cap. Signed-off-by: Pierre Gondois <pierre.gondois@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Link: https://lore.kernel.org/r/20221006081052.3862167-2-pierre.gondois@arm.com
2022-10-27	sched/fair: Consider capacity inversion in util_fits_cpu()	Qais Yousef	1	-5/+9
	We do consider thermal pressure in util_fits_cpu() for uclamp_min only. With the exception of the biggest cores which by definition are the max performance point of the system and all tasks by definition should fit. Even under thermal pressure, the capacity of the biggest CPU is the highest in the system and should still fit every task. Except when it reaches capacity inversion point, then this is no longer true. We can handle this by using the inverted capacity as capacity_orig in util_fits_cpu(). Which not only addresses the problem above, but also ensure uclamp_max now considers the inverted capacity. Force fitting a task when a CPU is in this adverse state will contribute to making the thermal throttling last longer. Signed-off-by: Qais Yousef <qais.yousef@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220804143609.515789-10-qais.yousef@arm.com
2022-10-27	sched/fair: Detect capacity inversion	Qais Yousef	2	-3/+79
	Check each performance domain to see if thermal pressure is causing its capacity to be lower than another performance domain. We assume that each performance domain has CPUs with the same capacities, which is similar to an assumption made in energy_model.c We also assume that thermal pressure impacts all CPUs in a performance domain equally. If there're multiple performance domains with the same capacity_orig, we will trigger a capacity inversion if the domain is under thermal pressure. The new cpu_in_capacity_inversion() should help users to know when information about capacity_orig are not reliable and can opt in to use the inverted capacity as the 'actual' capacity_orig. Signed-off-by: Qais Yousef <qais.yousef@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220804143609.515789-9-qais.yousef@arm.com
2022-10-27	sched/uclamp: Cater for uclamp in find_energy_efficient_cpu()'s early exit ↵	Qais Yousef	1	-6/+8
	condition If the utilization of the woken up task is 0, we skip the energy calculation because it has no impact. But if the task is boosted (uclamp_min != 0) will have an impact on task placement and frequency selection. Only skip if the util is truly 0 after applying uclamp values. Change uclamp_task_cpu() signature to avoid unnecessary additional calls to uclamp_eff_get(). feec() is the only user now. Fixes: 732cd75b8c920 ("sched/fair: Select an energy-efficient CPU on task wake-up") Signed-off-by: Qais Yousef <qais.yousef@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220804143609.515789-8-qais.yousef@arm.com
2022-10-27	sched/uclamp: Make cpu_overutilized() use util_fits_cpu()	Qais Yousef	1	-1/+4
	So that it is now uclamp aware. This fixes a major problem of busy tasks capped with UCLAMP_MAX keeping the system in overutilized state which disables EAS and leads to wasting energy in the long run. Without this patch running a busy background activity like JIT compilation on Pixel 6 causes the system to be in overutilized state 74.5% of the time. With this patch this goes down to 9.79%. It also fixes another problem when long running tasks that have their UCLAMP_MIN changed while running such that they need to upmigrate to honour the new UCLAMP_MIN value. The upmigration doesn't get triggered because overutilized state never gets set in this state, hence misfit migration never happens at tick in this case until the task wakes up again. Fixes: af24bde8df202 ("sched/uclamp: Add uclamp support to energy_compute()") Signed-off-by: Qais Yousef <qais.yousef@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220804143609.515789-7-qais.yousef@arm.com
2022-10-27	sched/uclamp: Make asym_fits_capacity() use util_fits_cpu()	Qais Yousef	1	-8/+13
	Use the new util_fits_cpu() to ensure migration margin and capacity pressure are taken into account correctly when uclamp is being used otherwise we will fail to consider CPUs as fitting in scenarios where they should. s/asym_fits_capacity/asym_fits_cpu/ to better reflect what it does now. Fixes: b4c9c9f15649 ("sched/fair: Prefer prev cpu in asymmetric wakeup path") Signed-off-by: Qais Yousef <qais.yousef@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220804143609.515789-6-qais.yousef@arm.com
2022-10-27	sched/uclamp: Make select_idle_capacity() use util_fits_cpu()	Qais Yousef	1	-3/+5
	Use the new util_fits_cpu() to ensure migration margin and capacity pressure are taken into account correctly when uclamp is being used otherwise we will fail to consider CPUs as fitting in scenarios where they should. Fixes: b4c9c9f15649 ("sched/fair: Prefer prev cpu in asymmetric wakeup path") Signed-off-by: Qais Yousef <qais.yousef@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220804143609.515789-5-qais.yousef@arm.com
2022-10-27	sched/uclamp: Fix fits_capacity() check in feec()	Qais Yousef	3	-10/+68
	As reported by Yun Hsiang [1], if a task has its uclamp_min >= 0.8 * 1024, it'll always pick the previous CPU because fits_capacity() will always return false in this case. The new util_fits_cpu() logic should handle this correctly for us beside more corner cases where similar failures could occur, like when using UCLAMP_MAX. We open code uclamp_rq_util_with() except for the clamp() part, util_fits_cpu() needs the 'raw' values to be passed to it. Also introduce uclamp_rq_{set, get}() shorthand accessors to get uclamp value for the rq. Makes the code more readable and ensures the right rules (use READ_ONCE/WRITE_ONCE) are respected transparently. [1] https://lists.linaro.org/pipermail/eas-dev/2020-July/001488.html Fixes: 1d42509e475c ("sched/fair: Make EAS wakeup placement consider uclamp restrictions") Reported-by: Yun Hsiang <hsiang023167@gmail.com> Signed-off-by: Qais Yousef <qais.yousef@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220804143609.515789-4-qais.yousef@arm.com
2022-10-27	sched/uclamp: Make task_fits_capacity() use util_fits_cpu()	Qais Yousef	2	-10/+25
	So that the new uclamp rules in regard to migration margin and capacity pressure are taken into account correctly. Fixes: a7008c07a568 ("sched/fair: Make task_fits_capacity() consider uclamp restrictions") Co-developed-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Qais Yousef <qais.yousef@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220804143609.515789-3-qais.yousef@arm.com
2022-10-27	sched/uclamp: Fix relationship between uclamp and migration margin	Qais Yousef	1	-0/+123
	fits_capacity() verifies that a util is within 20% margin of the capacity of a CPU, which is an attempt to speed up upmigration. But when uclamp is used, this 20% margin is problematic because for example if a task is boosted to 1024, then it will not fit on any CPU according to fits_capacity() logic. Or if a task is boosted to capacity_orig_of(medium_cpu). The task will end up on big instead on the desired medium CPU. Similar corner cases exist for uclamp and usage of capacity_of(). Slightest irq pressure on biggest CPU for example will make a 1024 boosted task look like it can't fit. What we really want is for uclamp comparisons to ignore the migration margin and capacity pressure, yet retain them for when checking the _actual_ util signal. For example, task p: p->util_avg = 300 p->uclamp[UCLAMP_MIN] = 1024 Will fit a big CPU. But p->util_avg = 900 p->uclamp[UCLAMP_MIN] = 1024 will not, this should trigger overutilized state because the big CPU is now actually being saturated. Similar reasoning applies to capping tasks with UCLAMP_MAX. For example: p->util_avg = 1024 p->uclamp[UCLAMP_MAX] = capacity_orig_of(medium_cpu) Should fit the task on medium cpus without triggering overutilized state. Inlined comments expand more on desired behavior in more scenarios. Introduce new util_fits_cpu() function which encapsulates the new logic. The new function is not used anywhere yet, but will be used to update various users of fits_capacity() in later patches. Fixes: af24bde8df202 ("sched/uclamp: Add uclamp support to energy_compute()") Signed-off-by: Qais Yousef <qais.yousef@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220804143609.515789-2-qais.yousef@arm.com
2022-10-24	Linux 6.1-rc2v6.1-rc2	Linus Torvalds	1	-1/+1

2022-10-24	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm	Linus Torvalds	18	-99/+180
	Pull kvm fixes from Paolo Bonzini: "RISC-V: - Fix compilation without RISCV_ISA_ZICBOM - Fix kvm_riscv_vcpu_timer_pending() for Sstc ARM: - Fix a bug preventing restoring an ITS containing mappings for very large and very sparse device topology - Work around a relocation handling error when compiling the nVHE object with profile optimisation - Fix for stage-2 invalidation holding the VM MMU lock for too long by limiting the walk to the largest block mapping size - Enable stack protection and branch profiling for VHE - Two selftest fixes x86: - add compat implementation for KVM_X86_SET_MSR_FILTER ioctl selftests: - synchronize includes between include/uapi and tools/include/uapi" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: tools: include: sync include/api/linux/kvm.h KVM: x86: Add compat handler for KVM_X86_SET_MSR_FILTER KVM: x86: Copy filter arg outside kvm_vm_ioctl_set_msr_filter() kvm: Add support for arch compat vm ioctls RISC-V: KVM: Fix kvm_riscv_vcpu_timer_pending() for Sstc RISC-V: Fix compilation without RISCV_ISA_ZICBOM KVM: arm64: vgic: Fix exit condition in scan_its_table() KVM: arm64: nvhe: Fix build with profile optimization KVM: selftests: Fix number of pages for memory slot in memslot_modification_stress_test KVM: arm64: selftests: Fix multiple versions of GIC creation KVM: arm64: Enable stack protection and branch profiling for VHE KVM: arm64: Limit stage2_apply_range() batch size to largest block KVM: arm64: Work out supported block level at compile time
2022-10-23	Revert "mfd: syscon: Remove repetition of the regmap_get_val_endian()"	Jason A. Donenfeld	1	-0/+8
	This reverts commit 72a95859728a7866522e6633818bebc1c2519b17. It broke reboots on big-endian MIPS and MIPS64 malta QEMU instances, which use the syscon driver. Little-endian is not effected, which means likely it's important to handle regmap_get_val_endian() in this function after all. Fixes: 72a95859728a ("mfd: syscon: Remove repetition of the regmap_get_val_endian()") Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Lee Jones <lee@kernel.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-10-23	kernel/utsname_sysctl.c: Fix hostname polling	Linus Torvalds	2	-0/+2
	Commit bfca3dd3d068 ("kernel/utsname_sysctl.c: print kernel arch") added a new entry to the uts_kern_table[] array, but didn't update the UTS_PROC_xyz enumerators of older entries, breaking anything that used them. Which is admittedly not many cases: it's really just the two uses of uts_proc_notify() in kernel/sys.c. But apparently journald-systemd actually uses this to detect hostname changes. Reported-by: Torsten Hilbrich <torsten.hilbrich@secunet.com> Fixes: bfca3dd3d068 ("kernel/utsname_sysctl.c: print kernel arch") Link: https://lore.kernel.org/lkml/0c2b92a6-0f25-9538-178f-eee3b06da23f@secunet.com/ Link: https://linux-regtracking.leemhuis.info/regzbot/regression/0c2b92a6-0f25-9538-178f-eee3b06da23f@secunet.com/ Cc: Petr Vorel <pvorel@suse.cz> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-10-23	Merge tag 'perf_urgent_for_v6.1_rc2' of ↵	Linus Torvalds	5	-46/+163
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Borislav Petkov: - Fix raw data handling when perf events are used in bpf - Rework how SIGTRAPs get delivered to events to address a bunch of problems with it. Add a selftest for that too * tag 'perf_urgent_for_v6.1_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: bpf: Fix sample_flags for bpf_perf_event_output selftests/perf_events: Add a SIGTRAP stress test with disables perf: Fix missing SIGTRAPs
2022-10-23	Merge tag 'sched_urgent_for_v6.1_rc2' of ↵	Linus Torvalds	4	-29/+35
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Borislav Petkov: - Adjust code to not trip up CFI - Fix sched group cookie matching * tag 'sched_urgent_for_v6.1_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched: Introduce struct balance_callback to avoid CFI mismatches sched/core: Fix comparison in sched_group_cookie_match()
2022-10-23	Merge tag 'objtool_urgent_for_v6.1_rc2' of ↵	Linus Torvalds	1	-1/+1
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool fix from Borislav Petkov: - Fix ORC stack unwinding when GCOV is enabled * tag 'objtool_urgent_for_v6.1_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/unwind/orc: Fix unreliable stack dump with gcov
2022-10-23	Merge tag 'x86_urgent_for_v6.0_rc2' of ↵	Linus Torvalds	11	-86/+122
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: "As usually the case, right after a major release, the tip urgent branches accumulate a couple more fixes than normal. And here is the x86, a bit bigger, urgent pile. - Use the correct CPU capability clearing function on the error path in Intel perf LBR - A CFI fix to ftrace along with a simplification - Adjust handling of zero capacity bit mask for resctrl cache allocation on AMD - A fix to the AMD microcode loader to attempt patch application on every logical thread - A couple of topology fixes to handle CPUID leaf 0x1f enumeration info properly - Drop a -mabi=ms compiler option check as both compilers support it now anyway - A couple of fixes to how the initial, statically allocated FPU buffer state is setup and its interaction with dynamic states at runtime" * tag 'x86_urgent_for_v6.0_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/fpu: Fix copy_xstate_to_uabi() to copy init states correctly perf/x86/intel/lbr: Use setup_clear_cpu_cap() instead of clear_cpu_cap() ftrace,kcfi: Separate ftrace_stub() and ftrace_stub_graph() x86/ftrace: Remove ftrace_epilogue() x86/resctrl: Fix min_cbm_bits for AMD x86/microcode/AMD: Apply the patch early on every logical thread x86/topology: Fix duplicated core ID within a package x86/topology: Fix multiple packages shown on a single-package system hwmon/coretemp: Handle large core ID value x86/Kconfig: Drop check for -mabi=ms for CONFIG_EFI_STUB x86/fpu: Exclude dynamic states from init_fpstate x86/fpu: Fix the init_fpstate size check with the actual size x86/fpu: Configure init_fpstate attributes orderly
2022-10-23	Merge tag 'io_uring-6.1-2022-10-22' of git://git.kernel.dk/linux	Linus Torvalds	4	-0/+7
	Pull io_uring follow-up from Jens Axboe: "Currently the zero-copy has automatic fallback to normal transmit, and it was decided that it'd be cleaner to return an error instead if the socket type doesn't support it. Zero-copy does work with UDP and TCP, it's more of a future proofing kind of thing (eg for samba)" * tag 'io_uring-6.1-2022-10-22' of git://git.kernel.dk/linux: io_uring/net: fail zc sendmsg when unsupported by socket io_uring/net: fail zc send when unsupported by socket net: flag sockets supporting msghdr originated zerocopy
2022-10-23	Merge tag 'hwmon-for-v6.1-rc2' of ↵	Linus Torvalds	3	-2/+8
	git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull hwmon fixes from Guenter Roeck: - corsair-psu: Fix typo in USB id description, and add USB ID for new PSU - pwm-fan: Fix fan power handling when disabling fan control * tag 'hwmon-for-v6.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: hwmon: (corsair-psu) Add USB id of the new HX1500i psu hwmon: (pwm-fan) Explicitly switch off fan power when setting pwm1_enable to 0 hwmon: (corsair-psu) fix typo in USB id description
2022-10-23	Merge tag 'i2c-for-6.1-rc2' of ↵	Linus Torvalds	6	-16/+12
	git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "RPM fix for qcom-cci, platform module alias for xiic, build warning fix for mlxbf, typo fixes in comments" * tag 'i2c-for-6.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: mlxbf: depend on ACPI; clean away ifdeffage i2c: fix spelling typos in comments i2c: qcom-cci: Fix ordering of pm_runtime_xx and i2c_add_adapter i2c: xiic: Add platform module alias
2022-10-23	Merge tag 'pci-v6.1-fixes-2' of ↵	Linus Torvalds	2	-5/+10
	git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull pci fixes from Bjorn Helgaas: - Revert a simplification that broke pci-tegra due to a masking error - Update MAINTAINERS for Kishon's email address change and TI DRA7XX/J721E maintainer change * tag 'pci-v6.1-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: MAINTAINERS: Update Kishon's email address in PCI endpoint subsystem MAINTAINERS: Add Vignesh Raghavendra as maintainer of TI DRA7XX/J721E PCI driver Revert "PCI: tegra: Use PCI_CONF1_EXT_ADDRESS() macro"
2022-10-23	Merge tag 'media/v6.1-2' of ↵	Linus Torvalds	132	-3873/+3193
	git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull missed media updates from Mauro Carvalho Chehab: "It seems I screwed-up my previous pull request: it ends up that only half of the media patches that were in linux-next got merged in -rc1. The script which creates the signed tags silently failed due to 5.19->6.0 so it ended generating a tag with incomplete stuff. So here are the missing parts: - a DVB core security fix - lots of fixes and cleanups for atomisp staging driver - old drivers that are VB1 are being moved to staging to be deprecated - several driver updates - mostly for embedded systems, but there are also some things addressing issues with some PC webcams, in the UVC video driver" * tag 'media/v6.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (163 commits) media: sun6i-csi: Move csi buffer definition to main header file media: sun6i-csi: Introduce and use video helper functions media: sun6i-csi: Add media ops with link notify callback media: sun6i-csi: Remove controls handler from the driver media: sun6i-csi: Register the media device after creation media: sun6i-csi: Pass and store csi device directly in video code media: sun6i-csi: Tidy up video code media: sun6i-csi: Tidy up v4l2 code media: sun6i-csi: Tidy up Kconfig media: sun6i-csi: Use runtime pm for clocks and reset media: sun6i-csi: Define and use variant to get module clock rate media: sun6i-csi: Always set exclusive module clock rate media: sun6i-csi: Tidy up platform code media: sun6i-csi: Refactor main driver data structures media: sun6i-csi: Define and use driver name and (reworked) description media: cedrus: Add a Kconfig dependency on RESET_CONTROLLER media: sun8i-rotate: Add a Kconfig dependency on RESET_CONTROLLER media: sun8i-di: Add a Kconfig dependency on RESET_CONTROLLER media: sun4i-csi: Add a Kconfig dependency on RESET_CONTROLLER media: sun6i-csi: Add a Kconfig dependency on RESET_CONTROLLER ...
2022-10-22	io_uring/net: fail zc sendmsg when unsupported by socket	Pavel Begunkov	1	-0/+2
	The previous patch fails zerocopy send requests for protocols that don't support it, do the same for zerocopy sendmsg. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/0854e7bb4c3d810a48ec8b5853e2f61af36a0467.1666346426.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-10-22	io_uring/net: fail zc send when unsupported by socket	Pavel Begunkov	1	-0/+2
	If a protocol doesn't support zerocopy it will silently fall back to copying. This type of behaviour has always been a source of troubles so it's better to fail such requests instead. Cc: <stable@vger.kernel.org> # 6.0 Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/2db3c7f16bb6efab4b04569cd16e6242b40c5cb3.1666346426.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-10-22	net: flag sockets supporting msghdr originated zerocopy	Pavel Begunkov	3	-0/+3
	We need an efficient way in io_uring to check whether a socket supports zerocopy with msghdr provided ubuf_info. Add a new flag into the struct socket flags fields. Cc: <stable@vger.kernel.org> # 6.0 Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/3dafafab822b1c66308bb58a0ac738b1e3f53f74.1666346426.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-10-22	hwmon: (corsair-psu) Add USB id of the new HX1500i psu	Wilken Gottwalt	2	-0/+3
	Also update the documentation accordingly. Signed-off-by: Wilken Gottwalt <wilken.gottwalt@posteo.net> Link: https://lore.kernel.org/r/Y0FghqQCHG/cX5Jz@monster.localdomain Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2022-10-22	tools: include: sync include/api/linux/kvm.h	Paolo Bonzini	1	-0/+1
	Provide a definition of KVM_CAP_DIRTY_LOG_RING_ACQ_REL. Fixes: 17601bfed909 ("KVM: Add KVM_CAP_DIRTY_LOG_RING_ACQ_REL capability and config option") Cc: Marc Zyngier <maz@kernel.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-22	KVM: x86: Add compat handler for KVM_X86_SET_MSR_FILTER	Alexander Graf	1	-0/+56
	The KVM_X86_SET_MSR_FILTER ioctls contains a pointer in the passed in struct which means it has a different struct size depending on whether it gets called from 32bit or 64bit code. This patch introduces compat code that converts from the 32bit struct to its 64bit counterpart which then gets used going forward internally. With this applied, 32bit QEMU can successfully set MSR bitmaps when running on 64bit kernels. Reported-by: Andrew Randrianasulu <randrianasulu@gmail.com> Fixes: 1a155254ff937 ("KVM: x86: Introduce MSR filtering") Signed-off-by: Alexander Graf <graf@amazon.com> Message-Id: <20221017184541.2658-4-graf@amazon.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-22	KVM: x86: Copy filter arg outside kvm_vm_ioctl_set_msr_filter()	Alexander Graf	1	-14/+17
	In the next patch we want to introduce a second caller to set_msr_filter() which constructs its own filter list on the stack. Refactor the original function so it takes it as argument instead of reading it through copy_from_user(). Signed-off-by: Alexander Graf <graf@amazon.com> Message-Id: <20221017184541.2658-3-graf@amazon.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-22	kvm: Add support for arch compat vm ioctls	Alexander Graf	2	-0/+13
	We will introduce the first architecture specific compat vm ioctl in the next patch. Add all necessary boilerplate to allow architectures to override compat vm ioctls when necessary. Signed-off-by: Alexander Graf <graf@amazon.com> Message-Id: <20221017184541.2658-2-graf@amazon.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-22	Merge tag 'kvm-riscv-fixes-6.1-1' of https://github.com/kvm-riscv/linux into ↵	Paolo Bonzini	6	-51/+57
	HEAD KVM/riscv fixes for 6.1, take #1 - Fix compilation without RISCV_ISA_ZICBOM - Fix kvm_riscv_vcpu_timer_pending() for Sstc
2022-10-22	Merge tag 'kvmarm-fixes-6.1-2' of ↵	Paolo Bonzini	2	-1/+8
	git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for 6.1, take #2 - Fix a bug preventing restoring an ITS containing mappings for very large and very sparse device topology - Work around a relocation handling error when compiling the nVHE object with profile optimisation
2022-10-22	Merge tag 'kvmarm-fixes-6.1-1' of ↵	Paolo Bonzini	7	-33/+28
	git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for 6.1, take #1 - Fix for stage-2 invalidation holding the VM MMU lock for too long by limiting the walk to the largest block mapping size - Enable stack protection and branch profiling for VHE - Two selftest fixes
2022-10-22	Merge tag 'thermal-6.1-rc2' of ↵	Linus Torvalds	1	-5/+1
	git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull thermal control fix from Rafael Wysocki: "This fixes the control CPU selection in the intel_powerclamp thermal driver" * tag 'thermal-6.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: thermal: intel_powerclamp: Use first online CPU as control_cpu
2022-10-22	Merge tag 'pm-6.1-rc2' of ↵	Linus Torvalds	5	-25/+20
	git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These fix some issues and clean up code in ARM cpufreq drivers. Specifics: - Fix module loading in the Tegra124 cpufreq driver (Jon Hunter) - Fix memory leak and update to read-only region in the qcom cpufreq driver (Fabien Parent) - Miscellaneous minor cleanups to cpufreq drivers (Fabien Parent, Yang Yingliang)" * tag 'pm-6.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: sun50i: Switch to use dev_err_probe() helper cpufreq: qcom-nvmem: Switch to use dev_err_probe() helper cpufreq: imx6q: Switch to use dev_err_probe() helper cpufreq: dt: Switch to use dev_err_probe() helper cpufreq: qcom: remove unused parameter in function definition cpufreq: qcom: fix writes in read-only memory region cpufreq: qcom: fix memory leak in error path cpufreq: tegra194: Fix module loading
2022-10-22	Merge tag 'acpi-6.1-rc2' of ↵	Linus Torvalds	7	-35/+60
	git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fixes from Rafael Wysocki: "These fix issues introduced during this merge window (ACPI/PCI, device enumeration and documentation) and some other ones found recently. Specifics: - Add missing device reference counting to acpi_get_pci_dev() after changing it recently (Rafael Wysocki) - Fix resource list walk in acpi_dma_get_range() (Robin Murphy) - Add IRQ override quirk for LENOVO IdeaPad and extend the IRQ override warning message (Jiri Slaby) - Fix integer overflow in ghes_estatus_pool_init() (Ashish Kalra) - Fix multiple error records handling in one of the ACPI extlog driver code paths (Tony Luck) - Prune DSDT override documentation from index after dropping it (Bagas Sanjaya)" * tag 'acpi-6.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: scan: Fix DMA range assignment ACPI: PCI: Fix device reference counting in acpi_get_pci_dev() ACPI: resource: note more about IRQ override ACPI: resource: do IRQ override on LENOVO IdeaPad ACPI: extlog: Handle multiple records ACPI: APEI: Fix integer overflow in ghes_estatus_pool_init() Documentation: ACPI: Prune DSDT override documentation from index
2022-10-22	Merge tag 'efi-fixes-for-v6.1-1' of ↵	Linus Torvalds	11	-81/+22
	git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI fixes from Ard Biesheuvel: - fixes for the EFI variable store refactor that landed in v6.0 - fixes for issues that were introduced during the merge window - back out some changes related to EFI zboot signing - we'll add a better solution for this during the next cycle * tag 'efi-fixes-for-v6.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: efi: runtime: Don't assume virtual mappings are missing if VA == PA == 0 efi: libstub: Fix incorrect payload size in zboot header efi: libstub: Give efi_main() asmlinkage qualification efi: efivars: Fix variable writes without query_variable_store() efi: ssdt: Don't free memory if ACPI table was loaded successfully efi: libstub: Remove zboot signing from build options
2022-10-22	Merge tag 'iommu-fixes-v6.1-rc1' of ↵	Linus Torvalds	11	-21/+37
	git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu fixes from Joerg Roedel: "Intel VT-d fixes: - Fix a lockdep splat issue in intel_iommu_init() - Allow NVS regions to pass RMRR check - Domain cleanup in error path" * tag 'iommu-fixes-v6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/vt-d: Clean up si_domain in the init_dmars() error path iommu/vt-d: Allow NVS regions in arch_rmrr_sanity_check() iommu/vt-d: Use rcu_lock in get_resv_regions iommu: Add gfp parameter to iommu_alloc_resv_region
2022-10-22	Merge tag 'for-linus-2022102101' of ↵	Linus Torvalds	6	-9/+83
	git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID fixes from Benjamin Tissoires: - a 12 year old bug fix for the Apple Magic Trackpad v1 (José Expósito) - a fix for a potential crash on removal of the Playstation controllers (Roderick Colenbrander) - a few new device IDs and device-specific quirks, most notably support of the new Playstation DualSense Edge controller * tag 'for-linus-2022102101' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: HID: lenovo: Make array tp10ubkbd_led static const HID: saitek: add madcatz variant of MMO7 mouse device ID HID: playstation: support updated DualSense rumble mode. HID: playstation: add initial DualSense Edge controller support HID: playstation: stop DualSense output work on remove. HID: magicmouse: Do not set BTN_MOUSE on double report
2022-10-22	Merge tag '6.1-rc1-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6	Linus Torvalds	11	-32/+68
	Pull cifs fixes from Steve French: - memory leak fixes - fixes for directory leases, including an important one which fixes a problem noticed by git functional tests - fixes relating to missing free_xid calls (helpful for tracing/debugging of entry/exit into cifs.ko) - a multichannel fix - a small cleanup fix (use of list_move instead of list_del/list_add) * tag '6.1-rc1-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: update internal module number cifs: fix memory leaks in session setup cifs: drop the lease for cached directories on rmdir or rename smb3: interface count displayed incorrectly cifs: Fix memory leak when build ntlmssp negotiate blob failed cifs: set rc to -ENOENT if we can not get a dentry for the cached dir cifs: use LIST_HEAD() and list_move() to simplify code cifs: Fix xid leak in cifs_get_file_info_unix() cifs: Fix xid leak in cifs_ses_add_channel() cifs: Fix xid leak in cifs_flock() cifs: Fix xid leak in cifs_copy_file_range() cifs: Fix xid leak in cifs_create()