<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/idle, branch v6.6.39</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.6.39</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.6.39'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2024-01-25T23:35:12+00:00</updated>
<entry>
<title>x86: Fix CPUIDLE_FLAG_IRQ_ENABLE leaking timer reprogram</title>
<updated>2024-01-25T23:35:12+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2023-11-15T15:13:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=08beb0d4362b3d3737c65de55f45ca6b7e52c71a'/>
<id>urn:sha1:08beb0d4362b3d3737c65de55f45ca6b7e52c71a</id>
<content type='text'>
[ Upstream commit edc8fc01f608108b0b7580cb2c29dfb5135e5f0e ]

intel_idle_irq() re-enables IRQs very early. As a result, an interrupt
may fire before mwait() is eventually called. If such an interrupt queues
a timer, it may go unnoticed until mwait returns and the idle loop
handles the tick re-evaluation. And monitoring TIF_NEED_RESCHED doesn't
help because a local timer enqueue doesn't set that flag.

The issue is mitigated by the fact that this idle handler is only invoked
for shallow C-states when, presumably, the next tick is supposed to be
close enough. There may still be rare cases though when the next tick
is far away and the selected C-state is shallow, resulting in a timer
getting ignored for a while.

Fix this with using sti_mwait() whose IRQ-reenablement only triggers
upon calling mwait(), dealing with the race while keeping the interrupt
latency within acceptable bounds.

Fixes: c227233ad64c (intel_idle: enable interrupts before C1 on Xeons)
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Rafael J. Wysocki &lt;rafael@kernel.org&gt;
Link: https://lkml.kernel.org/r/20231115151325.6262-3-frederic@kernel.org
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'perf-core-2023-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2023-08-28T23:35:01+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2023-08-28T23:35:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1a7c611546e552193180941ecf6b191e659db979'/>
<id>urn:sha1:1a7c611546e552193180941ecf6b191e659db979</id>
<content type='text'>
Pull perf event updates from Ingo Molnar:

 - AMD IBS improvements

 - Intel PMU driver updates

 - Extend core perf facilities &amp; the ARM PMU driver to better handle ARM big.LITTLE events

 - Micro-optimize software events and the ring-buffer code

 - Misc cleanups &amp; fixes

* tag 'perf-core-2023-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/uncore: Remove unnecessary ?: operator around pcibios_err_to_errno() call
  perf/x86/intel: Add Crestmont PMU
  x86/cpu: Update Hybrids
  x86/cpu: Fix Crestmont uarch
  x86/cpu: Fix Gracemont uarch
  perf: Remove unused extern declaration arch_perf_get_page_size()
  perf: Remove unused PERF_PMU_CAP_HETEROGENEOUS_CPUS capability
  arm_pmu: Remove unused PERF_PMU_CAP_HETEROGENEOUS_CPUS capability
  perf/x86: Remove unused PERF_PMU_CAP_HETEROGENEOUS_CPUS capability
  arm_pmu: Add PERF_PMU_CAP_EXTENDED_HW_TYPE capability
  perf/x86/ibs: Set mem_lvl_num, mem_remote and mem_hops for data_src
  perf/mem: Add PERF_MEM_LVLNUM_NA to PERF_MEM_NA
  perf/mem: Introduce PERF_MEM_LVLNUM_UNC
  perf/ring_buffer: Use local_try_cmpxchg in __perf_output_begin
  locking/arch: Avoid variable shadowing in local_try_cmpxchg()
  perf/core: Use local64_try_cmpxchg in perf_swevent_set_period
  perf/x86: Use local64_try_cmpxchg
  perf/amd: Prevent grouping of IBS events
</content>
</entry>
<entry>
<title>x86/cpu: Fix Gracemont uarch</title>
<updated>2023-08-09T19:51:06+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2023-08-07T12:38:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=882cdb06b668488a42ef717a260c05ba7dc43a49'/>
<id>urn:sha1:882cdb06b668488a42ef717a260c05ba7dc43a49</id>
<content type='text'>
Alderlake N is an E-core only product using Gracemont
micro-architecture. It fits the pre-existing naming scheme perfectly
fine, adhere to it.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Acked-by: Hans de Goede &lt;hdegoede@redhat.com&gt;
Link: https://lore.kernel.org/r/20230807150405.686834933@infradead.org
</content>
</entry>
<entry>
<title>Revert "intel_idle: Add support for using intel_idle in a VM guest using just hlt"</title>
<updated>2023-07-19T18:10:03+00:00</updated>
<author>
<name>Rafael J. Wysocki</name>
<email>rafael.j.wysocki@intel.com</email>
</author>
<published>2023-07-19T18:10:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5534f44627418897cd901d725303ce3dedd7bc1e'/>
<id>urn:sha1:5534f44627418897cd901d725303ce3dedd7bc1e</id>
<content type='text'>
This reverts commit 2f3d08f074b0 ("intel_idle: Add support for using
intel_idle in a VM guest using just hlt"), because it causes functional
issues to appear and it is not really useful without a related commit
that got reverted previously.

Link: https://lore.kernel.org/linux-pm/5c7de6d5-7706-c4a5-7c41-146db1269aff@intel.com
Reported-by: Xiaoyao Li &lt;xiaoyao.li@intel.com&gt;
Requested-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
</content>
</entry>
<entry>
<title>Revert "intel_idle: Add a "Long HLT" C1 state for the VM guest mode"</title>
<updated>2023-07-19T18:05:43+00:00</updated>
<author>
<name>Rafael J. Wysocki</name>
<email>rafael.j.wysocki@intel.com</email>
</author>
<published>2023-07-19T17:57:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a5155c023d6a2c9ff5f4fad5f75117578a58affc'/>
<id>urn:sha1:a5155c023d6a2c9ff5f4fad5f75117578a58affc</id>
<content type='text'>
This reverts commit 0fac214bb75e ("intel_idle: Add a "Long HLT" C1 state
for the VM guest mode"), because there is a coding mistake in it and its
validity is questioned.

Link: https://lore.kernel.org/all/20230711132553.GN3062772@hirez.programming.kicks-ass.net
Requested-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
</content>
</entry>
<entry>
<title>Revert "intel_idle: Add __init annotation to matchup_vm_state_with_baremetal()"</title>
<updated>2023-07-19T17:56:08+00:00</updated>
<author>
<name>Rafael J. Wysocki</name>
<email>rafael.j.wysocki@intel.com</email>
</author>
<published>2023-07-19T17:56:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d46b0a05bdc849d39c75268ddaf654c59bd6785c'/>
<id>urn:sha1:d46b0a05bdc849d39c75268ddaf654c59bd6785c</id>
<content type='text'>
This reverts commit b2918089d5cb ("intel_idle: Add __init annotation to
matchup_vm_state_with_baremetal()"), because the commit fixed by it will
be reverted.

Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
</content>
</entry>
<entry>
<title>intel_idle: Add __init annotation to matchup_vm_state_with_baremetal()</title>
<updated>2023-06-28T17:09:55+00:00</updated>
<author>
<name>Rafael J. Wysocki</name>
<email>rafael.j.wysocki@intel.com</email>
</author>
<published>2023-06-27T17:50:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b2918089d5cb452e928ad9f86c5ee601c592cee3'/>
<id>urn:sha1:b2918089d5cb452e928ad9f86c5ee601c592cee3</id>
<content type='text'>
The caller of (recently added) matchup_vm_state_with_baremetal() is an
__init function and it uses some __initdata data structures, so add the
__init annotation to it for consistency.

This addresses the following build warnings:

WARNING: modpost: vmlinux: section mismatch in reference: matchup_vm_state_with_baremetal+0x51 (section: .text) -&gt; intel_idle_max_cstate_reached (section: .init.text)
WARNING: modpost: vmlinux: section mismatch in reference: matchup_vm_state_with_baremetal+0x62 (section: .text) -&gt; cpuidle_state_table (section: .init.data)
WARNING: modpost: vmlinux: section mismatch in reference: matchup_vm_state_with_baremetal+0x79 (section: .text) -&gt; icpu (section: .init.data)

Fixes: 0fac214bb75e ("intel_idle: Add a "Long HLT" C1 state for the VM guest mode")
Reported-by: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Tested-by: Randy Dunlap &lt;rdunlap@infradead.org&gt; # build-tested
Reviewed-by: Randy Dunlap &lt;rdunlap@infradead.org&gt;
</content>
</entry>
<entry>
<title>intel_idle: Add a "Long HLT" C1 state for the VM guest mode</title>
<updated>2023-06-21T17:46:58+00:00</updated>
<author>
<name>Arjan van de Ven</name>
<email>arjan@linux.intel.com</email>
</author>
<published>2023-06-20T18:03:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=0fac214bb75ee64d7b9e6521d53f8251a4d324ea'/>
<id>urn:sha1:0fac214bb75ee64d7b9e6521d53f8251a4d324ea</id>
<content type='text'>
intel_idle will, for the bare metal case, usually have one or more deep
power states that have the CPUIDLE_FLAG_TLB_FLUSHED flag set. When
a state with this flag is selected by the cpuidle framework, it will also
flush the TLBs as part of entering this state. The benefit of doing this is
that the kernel does not need to wake the cpu out of this deep power state
just to flush the TLBs... for which the latency can be very high due to
the exit latency of deep power states.

In a VM guest currently, this benefit of avoiding the wakeup does not exist,
while the problem (long exit latency) is even more severe. Linux will need
to wake up a vCPU (causing the host to either come out of a deep C state,
or the VMM to have to deschedule something else to schedule the vCPU) which
can take a very long time.. adding a lot of latency to tlb flush operations
(including munmap and others).

To solve this, add a "Long HLT" C state to the state table for the VM guest
case that has the CPUIDLE_FLAG_TLB_FLUSHED flag set.  The result of that is
that for long idle periods (where the VMM is likely to do things that cause
large latency) the cpuidle framework will flush the TLBs (and avoid the
wakeups), while for short/quick idle durations, the existing behavior is
retained.

Now, there is still only "hlt" available in the guest, but for long idle,
the host can go to a deeper state (say C6).  There is a reasonable debate
one can have to what to set for the exit_latency and break even point for
this "Long HLT" state.  The good news is that intel_idle has these values
available for the underlying CPU (even when mwait is not exposed).  The
solution thus is to just use the latency and break even of the deepest state
from the bare metal CPU.  This is under the assumption that this is a pretty
reasonable estimate of what the VMM would do to cause latency.

Signed-off-by: Arjan van de Ven &lt;arjan@linux.intel.com&gt;
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
</content>
</entry>
<entry>
<title>intel_idle: Add support for using intel_idle in a VM guest using just hlt</title>
<updated>2023-06-16T17:12:36+00:00</updated>
<author>
<name>Arjan van de Ven</name>
<email>arjan@linux.intel.com</email>
</author>
<published>2023-06-05T15:47:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2f3d08f074b02aa449de27238fda72496c789034'/>
<id>urn:sha1:2f3d08f074b02aa449de27238fda72496c789034</id>
<content type='text'>
In a typical VM guest, the mwait instruction is not available, leaving
only the 'hlt' instruction (which causes a VMEXIT to the host).

So for this common case, intel_idle will detect the lack of mwait, and
fail to initialize (after which another idle method would step in which
will just use hlt always).

Other (non-common) cases exist; the table below shows the before/after
for these:

+------------+--------------------------+-------------------------+
| Hypervisor | Idle method before patch | Idle method after patch |
| exposes    |                          |                         |
+============+==========================+=========================+
| nothing    | default_idle fallback    | intel_idle VM table     |
| (common)   | (straight "hlt")         |                         |
+------------+--------------------------+-------------------------+
| mwait      | intel_idle mwait table   | intel_idle mwait table  |
+------------+--------------------------+-------------------------+
| ACPI       | ACPI C1 state ("hlt")    | intel_idle VM table     |
+------------+--------------------------+-------------------------+

This is only applicable to CPUs known by intel_idle. For the bare metal
case, unknown CPU models will use the ACPI tables (when available) to
get estimates for latency and break even point for longer idle states.
In guests, the common case is that ACPI tables are not available, but
even when they are available, they can't and don't provide the latency
information for the longer (mwait based) states. For this scenario
(unknown CPU model), the default_idle mode (no ACPI) or ACPI C1 (ACPI
avaible) will be used.

By providing capability to do this with the intel_idle driver, we can
do better than the fallback or ACPI table methods. While this current
change only gets us to the existing behavior, later patches in this
series will add new capabilities such as optimized TLB flushing.

In order to do this, a simplified version of the initialization
function for VM guests is created, and this will be called if the CPU
is recognized, but mwait is not supported, and we're in a VM guest.

One thing to note is that the max latency (and break even) of this C1
state is higher than the typical bare metal C1 state. Because hlt causes
a vmexit, and the cost of vmexit + hypervisor overhead + vmenter is
typically in the order of upto 5 microseconds... even if the hypervisor
does not actually goes into a hardware power saving state.

Signed-off-by: Arjan van de Ven &lt;arjan@linux.intel.com&gt;
[ rjw: Dropped redundant checks from should_verify_mwait() ]
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
</content>
</entry>
<entry>
<title>intel_idle: clean up the (new) state_update_enter_method function</title>
<updated>2023-06-12T18:07:22+00:00</updated>
<author>
<name>Arjan van de Ven</name>
<email>arjan@linux.intel.com</email>
</author>
<published>2023-06-05T15:47:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7826c069c8765969cfb7a364f8e0e663fc132b10'/>
<id>urn:sha1:7826c069c8765969cfb7a364f8e0e663fc132b10</id>
<content type='text'>
Now that the logic for state_update_enter_method() is in its own
function, the long if .. else if .. else if .. else if chain
can be simplified by just returning from the function
at the various places. This does not change functionality,
but it makes the logic much simpler to read or modify later.

Signed-off-by: Arjan van de Ven &lt;arjan@linux.intel.com&gt;
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
</content>
</entry>
</feed>
