2024-12-06  configs: Enable PREEMPT_RT and set 1000 Hz timer interrupt  [tags: JH7110_VF2_6.6_v5.13.2_RTLINUX, rt-linux-6.6.y-release]  (Minda Chen, 1 file, -4/+5)
Enable RT-Linux and set the timer interrupt to 1000 Hz. Also build the gmac driver as a module (for EtherCAT).
Signed-off-by: Minda Chen <minda.chen@starfivetech.com>
2024-12-06  riscv: Fix error return in cpufeature.c  (Minda Chen, 1 file, -1/+1)
Fix an incorrect error return value in cpufeature.c.
Signed-off-by: Minda Chen <minda.chen@starfivetech.com>
2024-12-06  Linux 6.6.20-rt25 REBASE  (Clark Williams, 1 file, -1/+1)
Signed-off-by: Clark Williams <clark.williams@gmail.com>
2024-12-06  printk: nbcon: move locked_port flag to struct uart_port  (Junxiao Chang, 3 files, -6/+5)
The console pointer in struct uart_port may be shared among multiple uart ports. The flag indicating that the port is locked by nbcon should therefore be stored in struct uart_port rather than in struct console.
Fixes: 6424f396c49e ("printk: nbcon: Implement processing in port->lock wrapper")
Suggested-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Junxiao Chang <junxiao.chang@intel.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/all/20240123054033.183114-2-junxiao.chang@intel.com
(cherry picked from commit d4fb86a96cb4a1efd24ca13a2ac234a1c9a3fdc5)
Signed-off-by: Clark Williams <clark.williams@gmail.com>
2024-12-06  arm: Disable FAST_GUP on PREEMPT_RT if HIGHPTE is also enabled.  (Sebastian Andrzej Siewior, 1 file, -1/+1)
gup_pgd_range() is invoked with disabled interrupts and invokes __kmap_local_page_prot() via pte_offset_map() and gup_p4d_range(). With HIGHPTE enabled, __kmap_local_page_prot() invokes kmap_high_get(), which uses a spinlock_t via lock_kmap_any(). This leads to a sleeping-while-atomic error on PREEMPT_RT because spinlock_t becomes a sleeping lock and must not be acquired in atomic context. The loop in map_new_virtual() uses a wait_queue_head_t for wakeups, which also relies on a spinlock_t.
Additionally limit HAVE_FAST_GUP so that it remains disabled on PREEMPT_RT if HIGHPTE is enabled.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
(cherry picked from commit 02cf5a345530b4d3a94093f0b5c784701c2e7c6a)
Signed-off-by: Clark Williams <clark.williams@gmail.com>
2024-12-06  Linux 6.6.18-rt23 REBASE  (Clark Williams, 1 file, -1/+1)
Signed-off-by: Clark Williams <clark.williams@gmail.com>
2024-12-06  Add localversion for -RT release  (Thomas Gleixner, 1 file, -0/+1)
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2024-12-06Revert "preempt: Put preempt_enable() within an instrumentation*() section."Clark Williams1-8/+2
This reverts commit cc3d27d9fdeddcb82db3ea176a44a5509e70eb1c. This code was fixed in 6.6 stable so no need for it in the RT series Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Clark Williams <williams@redhat.com>
2024-12-06  arch/riscv: check_unaligned_access(): don't alloc page for check  (Clark Williams, 1 file, -6/+0)
Drop the alloc_pages() call: the page is passed in as a parameter, and the locally allocated page would never be freed.
Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Clark Williams <clark.williams@gmail.com>
2024-12-06  sysfs: Add /sys/kernel/realtime entry  (Clark Williams, 1 file, -0/+12)
Add a /sys/kernel entry to indicate that the kernel is a realtime kernel.
Clark says that he needs this for udev rules: udev needs to evaluate whether it is a PREEMPT_RT kernel a few thousand times, and parsing uname output is too slow.
Are there better solutions? Should it exist and return 0 on !-rt?
Signed-off-by: Clark Williams <williams@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
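A minimal sketch of what such an entry can look like, assuming a plain read-only kobj_attribute under kernel_kobj (the actual patch may differ in detail, e.g. in how it is guarded by CONFIG_PREEMPT_RT):

    #include <linux/init.h>
    #include <linux/kernel.h>
    #include <linux/kobject.h>
    #include <linux/sysfs.h>

    /* Reads as "1" on a realtime kernel. */
    static ssize_t realtime_show(struct kobject *kobj,
                                 struct kobj_attribute *attr, char *buf)
    {
            return sprintf(buf, "%d\n", 1);
    }

    static struct kobj_attribute realtime_attr = __ATTR_RO(realtime);

    static int __init realtime_sysfs_init(void)
    {
            /* Creates /sys/kernel/realtime. */
            return sysfs_create_file(kernel_kobj, &realtime_attr.attr);
    }
    core_initcall(realtime_sysfs_init);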
2024-12-06  riscv: allow to enable RT  (Jisheng Zhang, 1 file, -0/+1)
Now it's ready to enable RT on riscv.
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
2024-12-06  riscv: add PREEMPT_AUTO support  (Jisheng Zhang, 2 files, -0/+3)
riscv has switched to GENERIC_ENTRY, so adding PREEMPT_AUTO is as simple as adding the TIF_ARCH_RESCHED_LAZY related definitions and enabling HAVE_PREEMPT_AUTO.
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
2024-12-06  RISC-V: Probe misaligned access speed in parallel  (Evan Green, 3 files, -11/+76)
Probing for misaligned access speed takes about 0.06 seconds. On a system with 64 cores, doing this in smp_callin() means it's done serially, extending boot time by 3.8 seconds. That's a lot of boot time.
Instead of measuring each CPU serially, let's do the measurements on all CPUs in parallel. If we disable preemption on all CPUs, the jiffies stop ticking, so we can do this in stages of 1) everybody except core 0, then 2) core 0. The allocations are all done outside of on_each_cpu() to avoid calling alloc_pages() with interrupts disabled.
For hotplugged CPUs that come in after the boot time measurement, register CPU hotplug callbacks, and do the measurement there. Interrupts are enabled in those callbacks, so they're fine to do alloc_pages() in.
[bigeasy: merge the individual patches into the final step.]
Reported-by: Jisheng Zhang <jszhang@kernel.org>
Closes: https://lore.kernel.org/all/mhng-9359993d-6872-4134-83ce-c97debe1cf9a@palmer-ri-x1c9/T/#mae9b8f40016f9df428829d33360144dc5026bcbf
Fixes: 584ea6564bca ("RISC-V: Probe for unaligned access speed")
Signed-off-by: Evan Green <evan@rivosinc.com>
Link: https://lore.kernel.org/r/20231106225855.3121724-1-evan@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
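A hedged sketch of the pattern described above; the function and variable names are illustrative, and the two-stage "everybody except core 0, then core 0" detail is omitted for brevity:

    #include <linux/cpu.h>
    #include <linux/cpuhotplug.h>
    #include <linux/gfp.h>
    #include <linux/init.h>
    #include <linux/preempt.h>
    #include <linux/smp.h>

    static struct page *probe_pages[NR_CPUS];

    /* Runs on every CPU via IPI, i.e. with interrupts disabled: it must
     * not allocate memory, only use the page prepared for it. */
    static void measure_misaligned_speed(void *unused)
    {
            struct page *page = probe_pages[smp_processor_id()];

            if (page) {
                    /* timed word-vs-byte copy loops against 'page' */
            }
    }

    /* Hotplug callback for CPUs appearing after boot; interrupts are
     * enabled here, so alloc_pages() is safe. */
    static int misaligned_online(unsigned int cpu)
    {
            probe_pages[cpu] = alloc_pages(GFP_KERNEL, 0);
            preempt_disable();
            measure_misaligned_speed(NULL);
            preempt_enable();
            return 0;
    }

    static int __init misaligned_probe_init(void)
    {
            int cpu, ret;

            /* Allocate all buffers up front, with interrupts enabled. */
            for_each_online_cpu(cpu)
                    probe_pages[cpu] = alloc_pages(GFP_KERNEL, 0);

            /* Measure on all online CPUs in parallel. */
            on_each_cpu(measure_misaligned_speed, NULL, 1);

            ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
                                    "riscv/misaligned:online",
                                    misaligned_online, NULL);
            return ret < 0 ? ret : 0;
    }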
2024-12-06  POWERPC: Allow to enable RT  (Sebastian Andrzej Siewior, 1 file, -0/+2)
Allow RT to be selected.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2024-12-06  powerpc/stackprotector: work around stack-guard init from atomic  (Sebastian Andrzej Siewior, 1 file, -1/+6)
The stack guard is initialized from the secondary CPU in atomic context, where the random number generator must not be used on PREEMPT_RT. On x86 the TSC is used instead. On Power the value is XORed against mftb() anyway, so let's use the stack address as the initial value.
Cc: stable-rt@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
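A hedged sketch of the idea, modelled on powerpc's boot_init_stack_canary(); details may differ from the actual patch:

    #include <linux/random.h>
    #include <linux/sched.h>
    #include <linux/version.h>
    #include <asm/reg.h>

    static __always_inline void boot_init_stack_canary(void)
    {
            unsigned long canary;

    #ifndef CONFIG_PREEMPT_RT
            canary = get_random_canary();
    #else
            /* The RNG may take sleeping locks, which is not allowed in
             * this atomic context on PREEMPT_RT: start from the stack
             * address instead. */
            canary = (unsigned long)&canary;
    #endif
            /* The timebase and kernel version are mixed in either way. */
            canary ^= mftb();
            canary ^= LINUX_VERSION_CODE;
            canary &= CANARY_MASK;

            current->stack_canary = canary;
    }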
2024-12-06  powerpc/kvm: Disable in-kernel MPIC emulation for PREEMPT_RT  (Bogdan Purcareata, 1 file, -0/+1)
While converting the openpic emulation code to use a raw_spinlock_t enables guests to run on RT, there's still a performance issue. For interrupts sent in directed delivery mode with a multiple-CPU mask, the emulated openpic will loop through all of the VCPUs, and for each VCPU it calls IRQ_check(), which loops through all the pending interrupts for that VCPU. This is done while holding the raw_lock, meaning that for all this time interrupts and preemption are disabled on the host Linux. A malicious user app can max out both of these numbers and cause a DoS.
This temporary fix is sent for two reasons. First, so that users who want to use the in-kernel MPIC emulation are aware of the potential latencies and can make sure that their usage scenario does not involve interrupts sent in directed delivery mode and that the number of possible pending interrupts is kept small. Secondly, this should incentivize the development of a proper openpic emulation that would be better suited for RT.
Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Bogdan Purcareata <bogdan.purcareata@freescale.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2024-12-06  powerpc/pseries: Select the generic memory allocator.  (Sebastian Andrzej Siewior, 1 file, -0/+1)
The RTAS work area allocator is using the generic memory allocator and as such it must select it.
Select the generic memory allocator on pseries.
Fixes: 43033bc62d349 ("powerpc/pseries: add RTAS work area allocator")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/20230309135110.uAxhqRFk@linutronix.de
2024-12-06  powerpc/pseries/iommu: Use a locallock instead of local_irq_save()  (Sebastian Andrzej Siewior, 1 file, -11/+20)
The local lock protects the per-CPU variable tce_page. The function attempts to allocate memory while tce_page is protected (by disabling interrupts), which is problematic on PREEMPT_RT. Use a local lock instead of local_irq_save()/local_irq_disable() to protect tce_page.
Cc: stable-rt@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
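A hedged sketch of the local_lock pattern described above; the names mirror the pseries code but may not match it exactly:

    #include <linux/local_lock.h>
    #include <linux/percpu.h>
    #include <linux/types.h>

    static DEFINE_PER_CPU(__be64 *, tce_page);
    static DEFINE_PER_CPU(local_lock_t, tce_page_lock) =
            INIT_LOCAL_LOCK(tce_page_lock);

    static void tce_build_example(void)
    {
            unsigned long flags;
            __be64 *tcep;

            /* On !PREEMPT_RT this disables interrupts just like
             * local_irq_save(); on PREEMPT_RT it takes a per-CPU
             * sleeping lock and the section stays preemptible. */
            local_lock_irqsave(&tce_page_lock, flags);
            tcep = __this_cpu_read(tce_page);

            /* ... program the TCE entries through tcep ... */

            local_unlock_irqrestore(&tce_page_lock, flags);
    }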
2024-12-06  powerpc: traps: Use PREEMPT_RT  (Sebastian Andrzej Siewior, 1 file, -1/+6)
Add PREEMPT_RT to the backtrace output if it is enabled.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2024-12-06  ARM64: Allow to enable RT  (Sebastian Andrzej Siewior, 1 file, -0/+1)
Allow RT to be selected.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2024-12-06  ARM: Allow to enable RT  (Sebastian Andrzej Siewior, 1 file, -0/+2)
Allow RT to be selected.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2024-12-06  ARM: vfp: Move sending signals outside of vfp_lock()ed section.  (Sebastian Andrzej Siewior, 1 file, -11/+18)
VFP_bounce() is invoked from within vfp_support_entry() and may send a signal. Sending a signal uses spinlock_t, which becomes a sleeping lock on PREEMPT_RT and must not be acquired within a preempt-disabled section.
Move the vfp_raise_sigfpe() block outside of the vfp_lock() section.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
2024-12-06  ARM: vfp: Use vfp_lock() in vfp_support_entry().  (Sebastian Andrzej Siewior, 1 file, -3/+3)
vfp_entry() is invoked from the exception handler and is fully preemptible. It uses local_bh_disable() to remain uninterrupted while checking the VFP state. This does not work on PREEMPT_RT because local_bh_disable() only synchronizes the relevant section while the context remains fully preemptible.
Use vfp_lock() for uninterrupted access.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
2024-12-06  ARM: vfp: Use vfp_lock() in vfp_sync_hwstate().  (Sebastian Andrzej Siewior, 1 file, -6/+3)
vfp_sync_hwstate() uses preempt_disable() followed by local_bh_disable() to ensure that it won't get interrupted while checking the VFP state. This harms PREEMPT_RT because softirq handling can get preempted and local_bh_disable() synchronizes the related section with a sleeping lock, which does not work with disabled preemption.
Use vfp_lock() to synchronize the access.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
2024-12-06  ARM: vfp: Provide vfp_lock() for VFP locking.  (Sebastian Andrzej Siewior, 1 file, -2/+30)
kernel_neon_begin() uses local_bh_disable() to ensure exclusive access to the VFP unit. This is broken on PREEMPT_RT because a BH-disabled section remains preemptible on PREEMPT_RT.
Introduce vfp_lock(), which uses local_bh_disable() on non-RT and preempt_disable() on PREEMPT_RT. Since softirqs are always processed in thread context there, disabling preemption is enough to ensure that the current context won't get interrupted by something that is using the VFP. Use it in kernel_neon_begin().
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
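A sketch of the helper pair as described above; the real implementation may differ slightly:

    #include <linux/bottom_half.h>
    #include <linux/preempt.h>

    static void vfp_lock(void)
    {
            if (!IS_ENABLED(CONFIG_PREEMPT_RT))
                    local_bh_disable();
            else
                    /* On PREEMPT_RT softirqs run in thread context, so
                     * disabling preemption keeps them off this CPU. */
                    preempt_disable();
    }

    static void vfp_unlock(void)
    {
            if (!IS_ENABLED(CONFIG_PREEMPT_RT))
                    local_bh_enable();
            else
                    preempt_enable();
    }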
2024-12-06  tty/serial/pl011: Make the locking work on RT  (Thomas Gleixner, 1 file, -8/+4)
The lock is a sleeping lock and local_irq_save() is not the optimisation we are looking for. Redo it to make it work on both -RT and non-RT.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2024-12-06  tty/serial/omap: Make the locking RT aware  (Thomas Gleixner, 1 file, -8/+4)
The lock is a sleeping lock and local_irq_save() is not the optimisation we are looking for. Redo it to make it work on both -RT and non-RT.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2024-12-06  ARM: enable irq in translation/section permission fault handlers  (Yadi.hu, 1 file, -0/+6)
This probably happens on all ARM with CONFIG_PREEMPT_RT and CONFIG_DEBUG_ATOMIC_SLEEP. This simple program:

    int main() { *((char *)0xc0001000) = 0; };

produces:

    [ 512.742724] BUG: sleeping function called from invalid context at kernel/rtmutex.c:658
    [ 512.743000] in_atomic(): 0, irqs_disabled(): 128, pid: 994, name: a
    [ 512.743217] INFO: lockdep is turned off.
    [ 512.743360] irq event stamp: 0
    [ 512.743482] hardirqs last enabled at (0): [< (null)>] (null)
    [ 512.743714] hardirqs last disabled at (0): [<c0426370>] copy_process+0x3b0/0x11c0
    [ 512.744013] softirqs last enabled at (0): [<c0426370>] copy_process+0x3b0/0x11c0
    [ 512.744303] softirqs last disabled at (0): [< (null)>] (null)
    [ 512.744631] [<c041872c>] (unwind_backtrace+0x0/0x104)
    [ 512.745001] [<c09af0c4>] (dump_stack+0x20/0x24)
    [ 512.745355] [<c0462490>] (__might_sleep+0x1dc/0x1e0)
    [ 512.745717] [<c09b6770>] (rt_spin_lock+0x34/0x6c)
    [ 512.746073] [<c0441bf0>] (do_force_sig_info+0x34/0xf0)
    [ 512.746457] [<c0442668>] (force_sig_info+0x18/0x1c)
    [ 512.746829] [<c041d880>] (__do_user_fault+0x9c/0xd8)
    [ 512.747185] [<c041d938>] (do_bad_area+0x7c/0x94)
    [ 512.747536] [<c041d990>] (do_sect_fault+0x40/0x48)
    [ 512.747898] [<c040841c>] (do_DataAbort+0x40/0xa0)
    [ 512.748181] Exception stack(0xecaa1fb0 to 0xecaa1ff8)

0xc0000000 belongs to the kernel address space, so a user task must not be allowed to access it. Under this condition the correct result is that the test case receives a segmentation fault and exits, not the splat above.
The root cause is commit 02fe2845d6a8 ("avoid enabling interrupts in prefetch/data abort handlers"), which deleted the irq-enable block in the data abort assembly code and moved it into the page/breakpoint/alignment fault handlers instead, but did not enable irqs in the translation/section permission fault handlers. ARM disables irqs when it enters exception/interrupt mode; if the kernel doesn't re-enable them, they remain disabled during translation/section permission faults.
We see the above splat because do_force_sig_info() is still called with IRQs off, and that code eventually does a:

    spin_lock_irqsave(&t->sighand->siglock, flags);

As this is architecture-independent code, and we've not seen any other arch need the siglock converted to a raw lock, we can conclude that we should enable irqs for the ARM translation/section permission exceptions.
Signed-off-by: Yadi.hu <yadi.hu@windriver.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2024-12-06  arm: Disable jump-label on PREEMPT_RT.  (Thomas Gleixner, 1 file, -1/+1)
jump-labels are used to efficiently switch between two possible code paths. To achieve this, stop_machine() is used to keep the CPU in a known state while the opcode is modified. The usage of stop_machine() here leads to large latency spikes, which can be observed on PREEMPT_RT.
Jump labels may change their target at runtime and are not restricted to the debug or "configuration/setup" part of a PREEMPT_RT system, where high latencies could be considered acceptable.
Disable jump-label support on a PREEMPT_RT system.
[bigeasy: Patch description.]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lkml.kernel.org/r/20220613182447.112191-2-bigeasy@linutronix.de
2024-12-06  sched: define TIF_ALLOW_RESCHED  (Thomas Gleixner, 20 files, -56/+171)
On Fri, Sep 22 2023 at 00:55, Thomas Gleixner wrote:
> On Thu, Sep 21 2023 at 09:00, Linus Torvalds wrote:
>> That said - I think as a proof of concept and "look, with this we get
>> the expected scheduling event counts", that patch is perfect. I think
>> you more than proved the concept.
>
> There is certainly quite some analysis work to do to make this a one to
> one replacement.
>
> With a handful of benchmarks the PoC (tweaked with some obvious fixes)
> is pretty much on par with the current mainline variants (NONE/FULL),
> but the memtier benchmark makes a massive dent.
>
> It sports a whopping 10% regression with the LAZY mode versus the
> mainline NONE model. Non-LAZY and FULL behave unsurprisingly in the
> same way.
>
> That benchmark is really sensitive to the preemption model. With current
> mainline (DYNAMIC_PREEMPT enabled) the preempt=FULL model has ~20%
> performance drop versus preempt=NONE.

That 20% was a tired pilot error. The real number is in the 5% ballpark.

> I have no clue what's going on there yet, but that shows that there is
> obviously quite some work ahead to get this sorted.

It took some head scratching to figure that out. The initial fix broke the handling of the hog issue, i.e. the problem that Ankur tried to solve, but I hacked up a "solution" for that too.

With that the memtier benchmark is roughly back to the mainline numbers, but my throughput-benchmark know-how is pretty close to zero, so that should be looked at by people who actually understand these things.

Likewise the hog prevention is just at the PoC level and clearly beyond my knowledge of scheduler details: it unconditionally forces a reschedule when the looping task is not responding to a lazy reschedule request before the next tick. IOW it forces a reschedule on the second tick, which is obviously different from the cond_resched()/might_sleep() behaviour.

The changes vs. the original PoC, aside of the bug and thinko fixes:

1) A hack to utilize the TRACE_FLAG_IRQS_NOSUPPORT flag to trace the lazy preempt bit, as the trace_entry::flags field is already full. That obviously breaks the tracer ABI, but if we go there then this needs to be fixed. Steven?

2) A debugfs file to validate that loops can be force-preempted w/o cond_resched(). The usage is:

    # taskset -c 1 bash
    # echo 1 > /sys/kernel/debug/sched/hog &
    # echo 1 > /sys/kernel/debug/sched/hog &
    # echo 1 > /sys/kernel/debug/sched/hog &

top shows ~33% CPU for each of the hogs and tracing confirms that the crude hack in the scheduler tick works:

    bash-4559 [001] dlh2. 2253.331202: resched_curr <-__update_curr
    bash-4560 [001] dlh2. 2253.340199: resched_curr <-__update_curr
    bash-4561 [001] dlh2. 2253.346199: resched_curr <-__update_curr
    bash-4559 [001] dlh2. 2253.353199: resched_curr <-__update_curr
    bash-4561 [001] dlh2. 2253.358199: resched_curr <-__update_curr
    bash-4560 [001] dlh2. 2253.370202: resched_curr <-__update_curr
    bash-4559 [001] dlh2. 2253.378198: resched_curr <-__update_curr
    bash-4561 [001] dlh2. 2253.389199: resched_curr <-__update_curr

The 'l' instead of the usual 'N' reflects that the lazy resched bit is set. That makes __update_curr() invoke resched_curr() instead of the lazy variant. resched_curr() sets TIF_NEED_RESCHED and folds it into preempt_count so that preemption happens at the next possible point, i.e. either in return from interrupt or at the next preempt_enable().

That's as much as I wanted to demonstrate and I'm not going to spend more cycles on it, as I already have too many other things in flight and the resulting scheduler woes are clearly outside of my expertise. Though I'm definitely putting a permanent NAK in place for any attempts to duct-tape the preempt=NONE model any further by sprinkling more cond*() and whatever warts around.

Thanks, tglx

[tglx: s@CONFIG_PREEMPT_AUTO@CONFIG_PREEMPT_BUILD_AUTO@ ]
Link: https://lore.kernel.org/all/87jzshhexi.ffs@tglx/
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
2024-12-06Revert "drm/i915: Depend on !PREEMPT_RT."Sebastian Andrzej Siewior1-1/+0
Once the known issues are addressed, it should be safe to enable the driver. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
2024-12-06  drm/i915/guc: Consider also RCU depth in busy loop.  (Sebastian Andrzej Siewior, 1 file, -1/+1)
intel_guc_send_busy_loop() looks at in_atomic() and irqs_disabled() to decide if it should busy-spin while waiting or if it may sleep. Both checks report false on PREEMPT_RT even if sleeping spinlocks are acquired, leading to RCU splats while the function sleeps. Check also if RCU has been disabled.
Reported-by: "John B. Wyatt IV" <jwyatt@redhat.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
2024-12-06  drm/i915: Do not disable preemption for resets  (Tvrtko Ursulin, 1 file, -7/+5)
Commit ade8a0f59844 ("drm/i915: Make all GPU resets atomic") added a preempt-disable section over the hardware reset callback to prepare the driver for being able to reset from atomic contexts.
In retrospect I can see that the work item at the time was about removing the struct mutex from the reset path. The code base also briefly entertained the idea of doing the reset under stop_machine() in order to serialize userspace mmap and a temporary glitch in the fence registers (see eb8d0f5af4ec ("drm/i915: Remove GPU reset dependence on struct_mutex")), but that never materialized and was soon removed in 2caffbf11762 ("drm/i915: Revoke mmaps and prevent access to fence registers across reset") and replaced with an SRCU-based solution.
As such, as far as I can see, today we still have a requirement that resets must not sleep (they are invoked from submission tasklets), but no need to support invoking them from a truly atomic context.
Given that the preemption section is problematic on RT kernels, since the uncore lock becomes a sleeping lock and so is invalid in such a section, let's try to remove it. A potential downside is that our short waits on GPU reset completion may get extended if CPU scheduling interferes, but in practice that probably isn't a deal breaker.
In terms of mechanics, since the preemption-disabled block is being removed, we just need to replace a few of the wait_for_atomic macros with busy-looping versions which will work (and not complain) when called from non-atomic sections.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris.p.wilson@intel.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/20230705093025.3689748-1-tvrtko.ursulin@linux.intel.com
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
2024-12-06  drm/i915: Drop the irqs_disabled() check  (Sebastian Andrzej Siewior, 1 file, -2/+0)
The !irqs_disabled() check triggers on PREEMPT_RT even with i915_sched_engine::lock acquired. The reason is that the lock is transformed into a sleeping lock on PREEMPT_RT and does not disable interrupts.
There is no need to check for disabled interrupts: the lockdep annotation below already checks whether the lock has been acquired by the caller and will yell if the interrupts are not disabled.
Remove the !irqs_disabled() check.
Reported-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
2024-12-06  drm/i915/gt: Use spin_lock_irq() instead of local_irq_disable() + spin_lock()  (Sebastian Andrzej Siewior, 1 file, -12/+5)
execlists_dequeue() is invoked from a function which uses local_irq_disable() to disable interrupts, so the spin_lock() behaves like spin_lock_irq(). This breaks PREEMPT_RT because local_irq_disable() + spin_lock() is not the same as spin_lock_irq().
execlists_dequeue_irq() and execlists_dequeue() each have only one caller. If intel_engine_cs::active::lock is acquired and released with the _irq suffix, then execlists_dequeue() behaves almost as if it were invoked with disabled interrupts. The difference is the last part of the function, which is then invoked with enabled interrupts. I can't tell if this makes a difference. From looking at it, it might work to move the last unlock to the end of the function, as I didn't find anything that would acquire the lock again.
Reported-by: Clark Williams <williams@redhat.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
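An illustrative before/after of the locking change (a generic lock stands in for the engine lock; this is not the actual i915 diff):

    #include <linux/spinlock.h>

    static DEFINE_SPINLOCK(sched_lock);  /* stand-in for the engine lock */

    static void dequeue_broken_on_rt(void)
    {
            /* The removed pattern: on PREEMPT_RT, spin_lock() maps to a
             * sleeping lock and must not be taken with interrupts
             * hard-disabled. */
            local_irq_disable();
            spin_lock(&sched_lock);
            /* ... dequeue work ... */
            spin_unlock(&sched_lock);
            local_irq_enable();
    }

    static void dequeue_rt_safe(void)
    {
            /* The new pattern: identical behaviour on !RT, while on
             * PREEMPT_RT the lock stays a sleeping lock and interrupts
             * stay enabled. */
            spin_lock_irq(&sched_lock);
            /* ... dequeue work ... */
            spin_unlock_irq(&sched_lock);
    }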
2024-12-06  drm/i915/gt: Queue and wait for the irq_work item.  (Sebastian Andrzej Siewior, 1 file, -3/+2)
Disabling interrupts and invoking the irq_work function directly breaks on PREEMPT_RT. PREEMPT_RT does not invoke all irq_work from hardirq context because some of the users have spinlock_t locking in the callback function. These locks are then turned into sleeping locks, which cannot be acquired with disabled interrupts.
Using irq_work_queue() has the benefit that the irq_work will be invoked in the regular context. In general there is "no" delay between enqueuing the callback and its invocation because the interrupt is raised right away on architectures which support it (which includes x86).
Use irq_work_queue() + irq_work_sync() instead of invoking the callback directly.
Reported-by: Clark Williams <williams@redhat.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
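A hedged sketch of the replacement; the rps_* names are illustrative, not the actual i915 identifiers:

    #include <linux/irq_work.h>

    /* The callback may take spinlock_t locks, which sleep on PREEMPT_RT,
     * so it must not be invoked directly with interrupts disabled. */
    static void rps_timer_expired(struct irq_work *work)
    {
            /* ... handle the expired timer ... */
    }

    static struct irq_work rps_work = IRQ_WORK_INIT(rps_timer_expired);

    static void rps_flush_example(void)
    {
            /* Queue the work; on architectures with an IRQ-work interrupt
             * (x86 included) the callback runs almost immediately ... */
            irq_work_queue(&rps_work);
            /* ... and wait here for it to complete. */
            irq_work_sync(&rps_work);
    }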
2024-12-06  drm/i915: skip DRM_I915_LOW_LEVEL_TRACEPOINTS with NOTRACE  (Sebastian Andrzej Siewior, 1 file, -1/+1)
The order of the header files is important. If this header file is included after tracepoint.h has been included, then the NOTRACE here becomes a nop. Currently this happens for two .c files which use the tracepoints behind DRM_I915_LOW_LEVEL_TRACEPOINTS.
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2024-12-06  drm/i915: Disable tracing points on PREEMPT_RT  (Sebastian Andrzej Siewior, 1 file, -0/+4)
Luca Abeni reported this:

    | BUG: scheduling while atomic: kworker/u8:2/15203/0x00000003
    | CPU: 1 PID: 15203 Comm: kworker/u8:2 Not tainted 4.19.1-rt3 #10
    | Call Trace:
    |  rt_spin_lock+0x3f/0x50
    |  gen6_read32+0x45/0x1d0 [i915]
    |  g4x_get_vblank_counter+0x36/0x40 [i915]
    |  trace_event_raw_event_i915_pipe_update_start+0x7d/0xf0 [i915]

The tracing events, trace_i915_pipe_update_start() among others, use functions which acquire spinlock_t locks; these are transformed into sleeping locks on PREEMPT_RT. A few trace points use intel_get_crtc_scanline(), others use ->get_vblank_counter(), which also might acquire a sleeping lock on PREEMPT_RT. At the time the arguments are evaluated within the trace point, preemption is disabled, and so the locks must not be acquired on PREEMPT_RT.
Based on this I don't see any other way than to disable the trace points on PREEMPT_RT.
Reported-by: Luca Abeni <lucabe72@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
2024-12-06  drm/i915: Don't check for atomic context on PREEMPT_RT  (Sebastian Andrzej Siewior, 1 file, -1/+1)
The !in_atomic() check in _wait_for_atomic() triggers on PREEMPT_RT because the uncore::lock is a spinlock_t and does not disable preemption or interrupts.
Changing the uncore::lock to a raw_spinlock_t doubles the worst-case latency on an otherwise idle testbox during testing. Therefore I'm currently unsure about changing this.
Link: https://lore.kernel.org/all/20211006164628.s2mtsdd2jdbfyf7g@linutronix.de/
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
2024-12-06  drm/i915: Don't disable interrupts on PREEMPT_RT during atomic updates  (Mike Galbraith, 1 file, -5/+10)
Commit 8d7849db3eab7 ("drm/i915: Make sprite updates atomic") started disabling interrupts across atomic updates. This breaks on PREEMPT_RT because within this section the code attempts to acquire spinlock_t locks, which are sleeping locks on PREEMPT_RT. According to the comment, the interrupts are disabled to avoid random delays and are not required for protection or synchronisation.
If this needs to happen with disabled interrupts on PREEMPT_RT, and the whole section is restricted to register access, then all sleeping locks would need to be acquired before interrupts are disabled, and some functions may have to be moved until after interrupts are enabled again. This includes:
- prepare_to_wait() + finish_wait() due to its wake queue.
- drm_crtc_vblank_put() -> vblank_disable_fn() due to drm_device::vbl_lock.
- skl_pfit_enable(), intel_update_plane(), vlv_atomic_update_fifo() and maybe others due to intel_uncore::lock.
- drm_crtc_arm_vblank_event() due to drm_device::event_lock and drm_device::vblank_time_lock.
Don't disable interrupts on PREEMPT_RT during atomic updates.
[bigeasy: drop local locks, commit message]
Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
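A minimal sketch of the resulting pattern (the function name is illustrative):

    #include <linux/irqflags.h>
    #include <linux/kconfig.h>

    static void pipe_update_example(void)
    {
            /* Interrupts are only disabled to keep the update jitter-free;
             * skip that on PREEMPT_RT, where this section takes spinlock_t
             * locks that must remain acquirable. */
            if (!IS_ENABLED(CONFIG_PREEMPT_RT))
                    local_irq_disable();

            /* ... the timing-sensitive register updates ... */

            if (!IS_ENABLED(CONFIG_PREEMPT_RT))
                    local_irq_enable();
    }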
2024-12-06  drm/i915: Use preempt_disable/enable_rt() where recommended  (Mike Galbraith, 1 file, -2/+4)
Mario Kleiner suggested in commit ad3543ede630f ("drm/intel: Push get_scanout_position() timestamping into kms driver.") spots where preemption should be disabled on PREEMPT_RT. The difference is that on PREEMPT_RT the intel_uncore::lock disables neither preemption nor interrupts, and so the region remains preemptible. The area covers only register reads and writes.
The parts that worry me are:
- __intel_get_crtc_scanline(): the worst case is 100us if no match is found.
- intel_crtc_scanlines_since_frame_timestamp(): not sure how long this may take in the worst case.
It was in the RT queue for a while and nobody complained. Disable preemption on PREEMPT_RT during timestamping.
[bigeasy: patch description.]
Cc: Mario Kleiner <mario.kleiner.de@gmail.com>
Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
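preempt_disable_rt()/preempt_enable_rt() come from the RT patch queue; a sketch of how such helpers are commonly defined (the exact definition in this tree may differ):

    #include <linux/compiler.h>
    #include <linux/preempt.h>

    /* No-ops unless PREEMPT_RT is enabled. */
    #ifdef CONFIG_PREEMPT_RT
    # define preempt_disable_rt()   preempt_disable()
    # define preempt_enable_rt()    preempt_enable()
    #else
    # define preempt_disable_rt()   barrier()
    # define preempt_enable_rt()    barrier()
    #endif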
2024-12-06  printk: Avoid false positive lockdep report for legacy driver.  (John Ogness, 1 file, -1/+8)
printk may invoke the legacy console driver from atomic context. This leads to a lockdep splat because the console driver will acquire a sleeping lock while the caller may also hold a spinning lock. This is reported by lockdep even on !PREEMPT_RT configurations because it would also lead to a problem on PREEMPT_RT.
On PREEMPT_RT, however, the atomic path is always avoided and the console driver is always invoked from a dedicated thread. Thus the lockdep splat is a false positive.
Override the lock context before invoking the console driver.
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
2024-12-06  serial: 8250: revert "drop lockdep annotation from serial8250_clear_IER()"  (John Ogness, 1 file, -0/+3)
The 8250 driver no longer depends on @oops_in_progress and will no longer violate the port->lock locking constraints.
This reverts commit 3d9e6f556e235ddcdc9f73600fdd46fe1736b090.
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
2024-12-06  printk: Add kthread for all legacy consoles  (John Ogness, 3 files, -46/+210)
The write callbacks of legacy consoles make use of spinlocks. This is not permitted with PREEMPT_RT in atomic contexts.
Create a new kthread to handle printing of all the legacy consoles (and nbcon consoles if boot consoles are registered). Since the consoles print in a task context, it is no longer appropriate to support the legacy handover mechanism.
These changes exist only for CONFIG_PREEMPT_RT.
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
2024-12-06  serial: 8250: Switch to nbcon console  (John Ogness, 3 files, -3/+201)
Implement the necessary callbacks to switch the 8250 console driver to perform as an nbcon console. Add implementations for the nbcon callbacks (write_atomic, write_thread, driver_enter, driver_exit) and add CON_NBCON to the initial flags.
The legacy code is kept in order to easily switch back to legacy mode by defining CONFIG_SERIAL_8250_LEGACY_CONSOLE.
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
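A hedged sketch of what the registration could look like with the callbacks named above; the signatures are assumptions and may not match the 6.6-rt tree exactly:

    static void serial8250_write_atomic(struct console *con,
                                        struct nbcon_write_context *wctxt);
    static void serial8250_write_thread(struct console *con,
                                        struct nbcon_write_context *wctxt);
    static void serial8250_driver_enter(struct console *con,
                                        unsigned long *flags);
    static void serial8250_driver_exit(struct console *con,
                                       unsigned long flags);

    static struct console univ8250_console = {
            .name          = "ttyS",
            .write_atomic  = serial8250_write_atomic,
            .write_thread  = serial8250_write_thread,
            .driver_enter  = serial8250_driver_enter,
            .driver_exit   = serial8250_driver_exit,
            /* CON_NBCON marks this console as using the nbcon model. */
            .flags         = CON_PRINTBUFFER | CON_ANYTIME | CON_NBCON,
    };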
2024-12-06  serial: core: Provide low-level functions to port lock  (John Ogness, 1 file, -0/+12)
The nbcon console's driver_enter() and driver_exit() callbacks need to lock the port lock in order to synchronize against other hardware activity (such as adjusting baud rates). However, they cannot use the uart_port_lock() wrappers because the printk subsystem will perform nbcon locking after calling the driver_enter() callback.
Provide low-level variants __uart_port_lock_irqsave() and __uart_port_unlock_irqrestore() for this purpose. These are only to be used by the driver_enter()/driver_exit() callbacks.
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
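A sketch of the low-level variants, assuming they map directly onto the spinlock embedded in struct uart_port:

    #include <linux/serial_core.h>

    /* Only for use by the driver_enter()/driver_exit() callbacks. */
    static inline void __uart_port_lock_irqsave(struct uart_port *up,
                                                unsigned long *flags)
    {
            spin_lock_irqsave(&up->lock, *flags);
    }

    static inline void __uart_port_unlock_irqrestore(struct uart_port *up,
                                                     unsigned long flags)
    {
            spin_unlock_irqrestore(&up->lock, flags);
    }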
2024-12-06  printk: nbcon: Provide function to reacquire ownership  (John Ogness, 2 files, -0/+34)
Contexts may become nbcon owners for various reasons, not just for printing. Indeed, the port->lock wrapper takes ownership for anything relating to the hardware.
Since ownership can be lost at any time due to handover or takeover, a context _should_ be prepared to back out immediately and carefully. However, there are many scenarios where the context _must_ reacquire ownership in order to finalize or revert hardware changes.
One such example is when interrupts are disabled by a context. No other context will automagically re-enable the interrupts. For this case, the disabling context _must_ reacquire nbcon ownership so that it can re-enable the interrupts.
Provide nbcon_reacquire() for exactly this purpose. Note that for printing contexts, after a successful reacquire the context will have no output buffer because that has been lost. nbcon_reacquire() cannot be used to resume printing.
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
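A hedged usage sketch: a write_atomic() callback that disabled device interrupts must get ownership back in order to restore them, even if it lost the console mid-way. All example_* helpers are hypothetical; only nbcon_reacquire() is introduced by this patch:

    #include <linux/console.h>

    /* Hypothetical helpers, for illustration only. */
    static void example_disable_device_irqs(struct console *con);
    static void example_enable_device_irqs(struct console *con);
    static bool example_emit_output(struct console *con,
                                    struct nbcon_write_context *wctxt);

    static void example_write_atomic(struct console *con,
                                     struct nbcon_write_context *wctxt)
    {
            example_disable_device_irqs(con);

            /* example_emit_output() returns false if nbcon ownership was
             * lost to a handover or takeover while printing. */
            if (!example_emit_output(con, wctxt)) {
                    /* Reacquire ownership purely to fix up the hardware;
                     * printing cannot resume, the output buffer is gone. */
                    nbcon_reacquire(wctxt);
            }

            /* Owner again (or never lost it): safe to touch the device. */
            example_enable_device_irqs(con);
    }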
2024-12-06  tty: sysfs: Add nbcon support for 'active'  (John Ogness, 1 file, -2/+9)
Allow the 'active' attribute to list nbcon consoles.
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
2024-12-06  proc: Add nbcon support for /proc/consoles  (John Ogness, 1 file, -3/+11)
Update /proc/consoles output to show 'W' if an nbcon write callback is implemented (write_atomic or write_thread). Also update /proc/consoles output to show 'N' if it is an nbcon console.
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
2024-12-06  printk: nbcon: Start printing threads  (John Ogness, 3 files, -1/+33)
If there are no boot consoles, the printing threads are started in an early_initcall. If there are boot consoles, the printing threads are started after the last boot console has unregistered. The printing threads do not need to be concerned about boot consoles because boot consoles cannot register once a non-boot console has registered.
Until the printing thread of a console has started, that console will print using write_atomic() in the printk() caller context.
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>