Age | Commit message | Author | Files | Lines |
|
Enable RT-Linux and set 1000 timer interrupts per second. Also build
the gmac driver as a module (for EtherCAT).
Signed-off-by: Minda Chen <minda.chen@starfivetech.com>
|
|
Fix return error in cpufeature.c
Signed-off-by: Minda Chen <minda.chen@starfivetech.com>
|
|
Signed-off-by: Clark Williams <clark.williams@gmail.com>
|
|
The console pointer in uart_port might be shared among multiple uart
ports. The flag indicating that a port is locked by nbcon should
therefore be saved in the uart_port structure instead of in the console
structure.
Fixes: 6424f396c49e ("printk: nbcon: Implement processing in port->lock wrapper")
Suggested-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Junxiao Chang <junxiao.chang@intel.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/all/20240123054033.183114-2-junxiao.chang@intel.com
(cherry picked from commit d4fb86a96cb4a1efd24ca13a2ac234a1c9a3fdc5)
Signed-off-by: Clark Williams <clark.williams@gmail.com>
|
|
gup_pgd_range() is invoked with disabled interrupts and invokes
__kmap_local_page_prot() via pte_offset_map() in the gup_p4d_range()
call chain. With HIGHPTE enabled, __kmap_local_page_prot() invokes
kmap_high_get(), which uses a spinlock_t via lock_kmap_any(). This leads
to a sleeping-while-atomic error on PREEMPT_RT because spinlock_t
becomes a sleeping lock there and must not be acquired in atomic
context.
The loop in map_new_virtual() uses a wait_queue_head_t for wakeups,
which also relies on a spinlock_t.
Additionally limit HAVE_FAST_GUP so that it remains disabled on
PREEMPT_RT with HIGHPTE enabled.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
(cherry picked from commit 02cf5a345530b4d3a94093f0b5c784701c2e7c6a)
Signed-off-by: Clark Williams <clark.williams@gmail.com>
|
|
Signed-off-by: Clark Williams <clark.williams@gmail.com>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
This reverts commit cc3d27d9fdeddcb82db3ea176a44a5509e70eb1c.
This code was fixed in 6.6 stable, so it is no longer needed in the RT series.
Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Clark Williams <williams@redhat.com>
|
|
Drop the alloc_pages() call since the page is passed in as
a parameter and the allocated page would never be freed.
Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Clark Williams <clark.williams@gmail.com>
|
|
Add a /sys/kernel entry to indicate that the kernel is a
realtime kernel.
Clark says that he needs this for udev rules: udev needs to evaluate
whether it's a PREEMPT_RT kernel a few thousand times, and parsing uname
output is too slow or so.
Are there better solutions? Should it exist and return 0 on !-rt?
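A minimal sketch of such an attribute, in the style of kernel/ksysfs.c
(which provides the KERNEL_ATTR_RO() macro); illustrative, not the
exact patch:
  static ssize_t realtime_show(struct kobject *kobj,
                               struct kobj_attribute *attr, char *buf)
  {
          return sprintf(buf, "%d\n", 1);
  }
  KERNEL_ATTR_RO(realtime);
Guarding this with #ifdef CONFIG_PREEMPT_RT would make the file exist
only on RT kernels; returning 0 on !RT would instead mean registering
the attribute unconditionally.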
Signed-off-by: Clark Williams <williams@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Now, it's ready to enable RT on riscv.
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
|
|
riscv has switched to GENERIC_ENTRY, so adding PREEMPT_AUTO is as simple
as adding TIF_ARCH_RESCHED_LAZY related definitions and enabling
HAVE_PREEMPT_AUTO.
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
|
|
Probing for misaligned access speed takes about 0.06 seconds. On a
system with 64 cores, doing this in smp_callin() means it's done
serially, extending boot time by 3.8 seconds. That's a lot of boot time.
Instead of measuring each CPU serially, let's do the measurements on
all CPUs in parallel. If we disable preemption on all CPUs at once,
jiffies stops ticking, so we do this in stages: 1) everybody
except core 0, then 2) core 0. The allocations are all done outside of
on_each_cpu() to avoid calling alloc_pages() with interrupts disabled.
For hotplugged CPUs that come in after the boot time measurement,
register CPU hotplug callbacks, and do the measurement there. Interrupts
are enabled in those callbacks, so they're fine to do alloc_pages() in.
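A sketch of the resulting flow; the helper names follow the patch,
while alloc_probe_bufs() and the exact signatures are illustrative:
  #include <linux/cpu.h>
  #include <linux/smp.h>

  void check_unaligned_access(struct page *buf);
  void check_unaligned_access_nonboot_cpu(void *param);
  int riscv_online_cpu(unsigned int cpu);
  struct page **alloc_probe_bufs(void);           /* stand-in */

  static int __init check_unaligned_access_all_cpus(void)
  {
          /* all buffers are allocated here, with interrupts enabled */
          struct page **bufs = alloc_probe_bufs();

          /* stage 1: every CPU except the boot CPU measures in
           * parallel (the callback bails out on CPU 0), so CPU 0
           * keeps jiffies ticking while the others run with
           * preemption disabled */
          on_each_cpu(check_unaligned_access_nonboot_cpu, bufs, 1);

          /* stage 2: the boot CPU measures itself */
          check_unaligned_access(bufs[0]);

          /* CPUs hotplugged later are measured from the online
           * callback, where interrupts are enabled and
           * alloc_pages() is safe */
          cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "riscv:unaligned",
                            riscv_online_cpu, NULL);
          return 0;
  }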
[bigeasy: merge the individual patches into the final step.]
Reported-by: Jisheng Zhang <jszhang@kernel.org>
Closes: https://lore.kernel.org/all/mhng-9359993d-6872-4134-83ce-c97debe1cf9a@palmer-ri-x1c9/T/#mae9b8f40016f9df428829d33360144dc5026bcbf
Fixes: 584ea6564bca ("RISC-V: Probe for unaligned access speed")
Signed-off-by: Evan Green <evan@rivosinc.com>
Link: https://lore.kernel.org/r/20231106225855.3121724-1-evan@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
|
|
Allow selecting RT.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
This is invoked from the secondary CPU in atomic context. On x86 we use
the tsc instead. On Power we XOR it against mftb(), so let's use the
stack address as the initial value.
Cc: stable-rt@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
While converting the openpic emulation code to use a raw_spinlock_t enables
guests to run on RT, there is still a performance issue. For interrupts sent in
directed delivery mode with a multiple-CPU mask, the emulated openpic loops
through all of the VCPUs, and for each VCPU it calls IRQ_check(), which loops
through all the pending interrupts for that VCPU. This is done while holding the
raw lock, meaning that in all this time interrupts and preemption are
disabled on the host. A malicious user app can max out both of these numbers
and cause a DoS.
This temporary fix is sent for two reasons. The first is so that users who want
to use the in-kernel MPIC emulation are aware of the potential latencies, thus
making sure that the hardware MPIC and their usage scenario do not involve
interrupts sent in directed delivery mode and that the number of possible
pending interrupts is kept small. The second is that this should incentivize
the development of a proper openpic emulation better suited for RT.
Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Bogdan Purcareata <bogdan.purcareata@freescale.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
The RTAS work area allocator is using the generic memory allocator and
as such it must select it.
Select the generic memory allocator on pseries.
Fixes: 43033bc62d349 ("powerpc/pseries: add RTAS work area allocator")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/20230309135110.uAxhqRFk@linutronix.de
|
|
The locallock protects the per-CPU variable tce_page. The function
attempts to allocate memory while tce_page is protected (by disabling
interrupts).
Use local_irq_save() instead of local_irq_disable().
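As a sketch of the pattern; the per-CPU variable matches
pseries/iommu.c, the wrapper function is illustrative:
  #include <linux/percpu.h>
  #include <linux/irqflags.h>

  static DEFINE_PER_CPU(__be64 *, tce_page);      /* as in pseries/iommu.c */

  static void build_tces(void)                    /* illustrative wrapper */
  {
          unsigned long flags;
          __be64 *tcep;

          /* save/restore instead of plain disable/enable, so the code
           * is also correct when entered with interrupts already off */
          local_irq_save(flags);
          tcep = __this_cpu_read(tce_page);
          /* ... allocate and fill the TCE page here ... */
          local_irq_restore(flags);
  }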
Cc: stable-rt@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Add PREEMPT_RT to the backtrace if enabled.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Allow selecting RT.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Allow selecting RT.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
VFP_bounce() is invoked from within vfp_support_entry() and may send a
signal. Sending a signal uses spinlock_t which becomes a sleeping lock
on PREEMPT_RT and must not be acquired within a preempt-disabled
section.
Move the vfp_raise_sigfpe() block outside of the vfp_lock() section.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
|
|
vfp_entry() is invoked from the exception handler and is fully preemptible.
It uses local_bh_disable() to remain uninterrupted while checking the
VFP state.
This is not working on PREEMPT_RT because local_bh_disable()
synchronizes the relevant section but the context remains fully
preemptible.
Use vfp_lock() for uninterrupted access.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
|
|
vfp_sync_hwstate() uses preempt_disable() followed by local_bh_disable()
to ensure that it won't get interrupted while checking the VFP state.
This harms PREEMPT_RT because softirq handling can get preempted and
local_bh_disable() synchronizes the related section with a sleeping lock
which does not work with disabled preemption.
Use the vfp_lock() to synchronize the access.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
|
|
kernel_neon_begin() uses local_bh_disable() to ensure exclusive access
to the VFP unit. This is broken on PREEMPT_RT because a BH disabled
section remains preemptible on PREEMPT_RT.
Introduce vfp_lock(), which uses local_bh_disable() on !PREEMPT_RT and
preempt_disable() on PREEMPT_RT. Since softirqs are always processed in
thread context on PREEMPT_RT, disabling preemption is enough to ensure
that the current context won't get interrupted by something that is
using the VFP. Use it in kernel_neon_begin().
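A sketch of the helper as described; the !RT path keeps the old
local_bh_disable() behaviour:
  #include <linux/bottom_half.h>
  #include <linux/preempt.h>

  static inline void vfp_lock(void)
  {
          if (IS_ENABLED(CONFIG_PREEMPT_RT))
                  preempt_disable();  /* softirqs run in task context on RT */
          else
                  local_bh_disable();
  }

  static inline void vfp_unlock(void)
  {
          if (IS_ENABLED(CONFIG_PREEMPT_RT))
                  preempt_enable();
          else
                  local_bh_enable();
  }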
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
|
|
The lock is a sleeping lock and local_irq_save() is not the optimisation
we are looking for. Redo it to make it work on -RT and non-RT.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
The lock is a sleeping lock and local_irq_save() is not the
optimisation we are looking for. Redo it to make it work on -RT and
non-RT.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Probably happens on all ARM, with
CONFIG_PREEMPT_RT
CONFIG_DEBUG_ATOMIC_SLEEP
This simple program:
  int main(void)
  {
          *((char *)0xc0001000) = 0;      /* write to a kernel address */
  }
produces the following splat:
[ 512.742724] BUG: sleeping function called from invalid context at kernel/rtmutex.c:658
[ 512.743000] in_atomic(): 0, irqs_disabled(): 128, pid: 994, name: a
[ 512.743217] INFO: lockdep is turned off.
[ 512.743360] irq event stamp: 0
[ 512.743482] hardirqs last enabled at (0): [< (null)>] (null)
[ 512.743714] hardirqs last disabled at (0): [<c0426370>] copy_process+0x3b0/0x11c0
[ 512.744013] softirqs last enabled at (0): [<c0426370>] copy_process+0x3b0/0x11c0
[ 512.744303] softirqs last disabled at (0): [< (null)>] (null)
[ 512.744631] [<c041872c>] (unwind_backtrace+0x0/0x104)
[ 512.745001] [<c09af0c4>] (dump_stack+0x20/0x24)
[ 512.745355] [<c0462490>] (__might_sleep+0x1dc/0x1e0)
[ 512.745717] [<c09b6770>] (rt_spin_lock+0x34/0x6c)
[ 512.746073] [<c0441bf0>] (do_force_sig_info+0x34/0xf0)
[ 512.746457] [<c0442668>] (force_sig_info+0x18/0x1c)
[ 512.746829] [<c041d880>] (__do_user_fault+0x9c/0xd8)
[ 512.747185] [<c041d938>] (do_bad_area+0x7c/0x94)
[ 512.747536] [<c041d990>] (do_sect_fault+0x40/0x48)
[ 512.747898] [<c040841c>] (do_DataAbort+0x40/0xa0)
[ 512.748181] Exception stack(0xecaa1fb0 to 0xecaa1ff8)
0xc0000000 belongs to the kernel address space; a user task cannot be
allowed to access it. Under this condition, the correct result is that
the test case receives a segmentation fault and exits, rather than
producing the splat above.
The root cause is commit 02fe2845d6a8 ("avoid enabling interrupts in
prefetch/data abort handlers"): it deleted the IRQ-enable block in the
data abort assembly code and moved it into the page/breakpoint/alignment
fault handlers instead, but did not enable IRQs in the translation/
section permission fault handlers. ARM disables IRQs when it enters
exception/interrupt mode, so if the kernel doesn't re-enable them, they
remain disabled during translation/section permission faults.
We see the above splat because do_force_sig_info() is still called with
IRQs off, and that code eventually does a:
  spin_lock_irqsave(&t->sighand->siglock, flags);
As this is architecture-independent code, and we've not seen any other
architecture need the siglock converted to a raw lock, we can conclude
that we should enable IRQs for the ARM translation/section permission
exception.
Signed-off-by: Yadi.hu <yadi.hu@windriver.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
jump-labels are used to efficiently switch between two possible code
paths. To achieve this, stop_machine() is used to keep the CPU in a
known state while the opcode is modified. The usage of stop_machine()
here leads to large latency spikes which can be observed on PREEMPT_RT.
Jump labels may change their target at runtime and are not restricted
to the debug or "configuration/setup" part of a PREEMPT_RT system where
high latencies could be defined as acceptable.
Disable jump-label support on a PREEMPT_RT system.
[bigeasy: Patch description.]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lkml.kernel.org/r/20220613182447.112191-2-bigeasy@linutronix.de
|
|
On Fri, Sep 22 2023 at 00:55, Thomas Gleixner wrote:
> On Thu, Sep 21 2023 at 09:00, Linus Torvalds wrote:
>> That said - I think as a proof of concept and "look, with this we get
>> the expected scheduling event counts", that patch is perfect. I think
>> you more than proved the concept.
>
> There is certainly quite some analysis work to do to make this a one to
> one replacement.
>
> With a handful of benchmarks the PoC (tweaked with some obvious fixes)
> is pretty much on par with the current mainline variants (NONE/FULL),
> but the memtier benchmark makes a massive dent.
>
> It sports a whopping 10% regression with the LAZY mode versus the mainline
> NONE model. Non-LAZY and FULL behave unsurprisingly in the same way.
>
> That benchmark is really sensitive to the preemption model. With current
> mainline (DYNAMIC_PREEMPT enabled) the preempt=FULL model has ~20%
> performance drop versus preempt=NONE.
That 20% was a tired pilot error. The real number is in the 5% ballpark.
> I have no clue what's going on there yet, but that shows that there is
> obviously quite some work ahead to get this sorted.
It took some head scratching to figure that out. The initial fix broke
the handling of the hog issue, i.e. the problem that Ankur tried to
solve, but I hacked up a "solution" for that too.
With that the memtier benchmark is roughly back to the mainline numbers,
but my throughput benchmark know-how is pretty close to zero, so that
should be looked at by people who actually understand these things.
Likewise the hog prevention is just at the PoC level and clearly beyond
my knowledge of scheduler details: It unconditionally forces a
reschedule when the looping task is not responding to a lazy reschedule
request before the next tick. IOW it forces a reschedule on the second
tick, which is obviously different from the cond_resched()/might_sleep()
behaviour.
The changes vs. the original PoC aside of the bug and thinko fixes:
1) A hack to utilize the TRACE_FLAG_IRQS_NOSUPPORT flag to trace the
lazy preempt bit as the trace_entry::flags field is full already.
That obviously breaks the tracer ABI, but if we go there then
this needs to be fixed. Steven?
2) debugfs file to validate that loops can be force preempted w/o
cond_resched()
The usage is:
# taskset -c 1 bash
# echo 1 > /sys/kernel/debug/sched/hog &
# echo 1 > /sys/kernel/debug/sched/hog &
# echo 1 > /sys/kernel/debug/sched/hog &
top shows ~33% CPU for each of the hogs and tracing confirms that
the crude hack in the scheduler tick works:
bash-4559 [001] dlh2. 2253.331202: resched_curr <-__update_curr
bash-4560 [001] dlh2. 2253.340199: resched_curr <-__update_curr
bash-4561 [001] dlh2. 2253.346199: resched_curr <-__update_curr
bash-4559 [001] dlh2. 2253.353199: resched_curr <-__update_curr
bash-4561 [001] dlh2. 2253.358199: resched_curr <-__update_curr
bash-4560 [001] dlh2. 2253.370202: resched_curr <-__update_curr
bash-4559 [001] dlh2. 2253.378198: resched_curr <-__update_curr
bash-4561 [001] dlh2. 2253.389199: resched_curr <-__update_curr
The 'l' instead of the usual 'N' reflects that the lazy resched
bit is set. That makes __update_curr() invoke resched_curr()
instead of the lazy variant. resched_curr() sets TIF_NEED_RESCHED
and folds it into preempt_count so that preemption happens at the
next possible point, i.e. either in return from interrupt or at
the next preempt_enable().
That's as much as I wanted to demonstrate and I'm not going to spend
more cycles on it as I have already too many other things in flight and
the resulting scheduler woes are clearly outside of my expertise.
Though definitely I'm putting a permanent NAK in place for any attempts
to duct tape the preempt=NONE model any further by sprinkling more
cond*() and whatever warts around.
Thanks,
tglx
[tglx: s@CONFIG_PREEMPT_AUTO@CONFIG_PREEMPT_BUILD_AUTO@ ]
Link: https://lore.kernel.org/all/87jzshhexi.ffs@tglx/
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
|
|
Once the known issues are addressed, it should be safe to enable the
driver.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
|
|
intel_guc_send_busy_loop() looks at in_atomic() and irqs_disabled() to
decide if it should busy-spin while waiting or if it may sleep.
Both checks will report false on PREEMPT_RT even while sleeping spinlocks
are held, leading to RCU splats because the function then sleeps inside
an RCU read-side section.
Check also if RCU has been disabled.
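A sketch of the extended condition; the wrapper name is illustrative:
  #include <linux/preempt.h>
  #include <linux/rcupdate.h>

  static bool must_busy_spin(void)        /* illustrative wrapper */
  {
          /* on PREEMPT_RT a held spinlock_t keeps in_atomic() and
           * irqs_disabled() false, but rcu_preempt_depth() still
           * flags the RCU read-side section where sleeping is bad */
          return in_atomic() || irqs_disabled() || rcu_preempt_depth();
  }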
Reported-by: "John B. Wyatt IV" <jwyatt@redhat.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
|
|
Commit ade8a0f59844 ("drm/i915: Make all GPU resets atomic") added a
preempt disable section over the hardware reset callback to prepare the
driver for being able to reset from atomic contexts.
In retrospect I can see that the work item at the time was about removing
the struct mutex from the reset path. The code base also briefly
entertained the idea of doing the reset under stop_machine in order to
serialize userspace mmap and a temporary glitch in the fence registers
(see eb8d0f5af4ec ("drm/i915: Remove GPU reset dependence on
struct_mutex")), but that never materialized and was soon removed in
2caffbf11762 ("drm/i915: Revoke mmaps and prevent access to fence
registers across reset") and replaced with an SRCU based solution.
As such, as far as I can see, today we still have a requirement that
resets must not sleep (invoked from submission tasklets), but no need to
support invoking them from a truly atomic context.
Given that the preemption section is problematic on RT kernels, since the
uncore lock becomes a sleeping lock and so is invalid in such a section,
let's try to remove it. The potential downside is that our short waits on
the GPU to complete the reset may get extended if CPU scheduling
interferes, but in practice that probably isn't a deal breaker.
In terms of mechanics, since the preemption-disabled block is being
removed we just need to replace a few of the wait_for_atomic macros with
busy-looping versions which will work (and not complain) when called from
non-atomic sections.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris.p.wilson@intel.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/20230705093025.3689748-1-tvrtko.ursulin@linux.intel.com
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
|
|
The !irqs_disabled() check triggers on PREEMPT_RT even with
i915_sched_engine::lock acquired. The reason is the lock is transformed
into a sleeping lock on PREEMPT_RT and does not disable interrupts.
There is no need to check for disabled interrupts. The lockdep
annotation below already checks if the lock has been acquired by the
caller and will yell if the interrupts are not disabled.
Remove the !irqs_disabled() check.
Reported-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
|
|
execlists_dequeue() is invoked from a function which uses
local_irq_disable() to disable interrupts so the spin_lock() behaves
like spin_lock_irq().
This breaks PREEMPT_RT because local_irq_disable() + spin_lock() is not
the same as spin_lock_irq().
execlists_dequeue_irq() and execlists_dequeue() each have only one
caller. If intel_engine_cs::active::lock is acquired and released with
the _irq suffix then it behaves almost as if execlists_dequeue() were
invoked with disabled interrupts. The difference is the last part of the
function, which is then invoked with enabled interrupts.
I can't tell if this makes a difference. From looking at it, it might
work to move the last unlock to the end of the function, as I didn't
find anything that would acquire the lock again.
Reported-by: Clark Williams <williams@redhat.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
|
|
Disabling interrupts and invoking the irq_work function directly breaks
on PREEMPT_RT.
PREEMPT_RT does not invoke all irq_work from hardirq context because
some of the users have spinlock_t locking in the callback function.
These locks are then turned into sleeping locks which cannot be
acquired with disabled interrupts.
Using irq_work_queue() has the benefit that the irq_work will be invoked
in the regular context. In general there is "no" delay between enqueuing
the callback and its invocation because the interrupt is raised right
away on architectures which support it (which includes x86).
Use irq_work_queue() + irq_work_sync() instead of invoking the callback
directly.
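The resulting pattern, sketched; the wrapper is illustrative, the two
API calls are the real ones:
  #include <linux/irq_work.h>

  static void kick_and_wait(struct irq_work *work)   /* illustrative */
  {
          irq_work_queue(work);   /* IPI is raised right away on x86 */
          irq_work_sync(work);    /* returns once the callback has run */
  }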
Reported-by: Clark Williams <williams@redhat.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
|
|
The order of the header files is important. If this header file is
included after tracepoint.h was included then the NOTRACE here becomes a
nop. Currently this happens for two .c files which use the tracepoints
behind DRM_I915_LOW_LEVEL_TRACEPOINTS.
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Luca Abeni reported this:
| BUG: scheduling while atomic: kworker/u8:2/15203/0x00000003
| CPU: 1 PID: 15203 Comm: kworker/u8:2 Not tainted 4.19.1-rt3 #10
| Call Trace:
| rt_spin_lock+0x3f/0x50
| gen6_read32+0x45/0x1d0 [i915]
| g4x_get_vblank_counter+0x36/0x40 [i915]
| trace_event_raw_event_i915_pipe_update_start+0x7d/0xf0 [i915]
The tracing events, trace_i915_pipe_update_start() among others, use
functions that acquire spinlock_t locks, which are transformed into
sleeping locks on PREEMPT_RT. A few trace points use
intel_get_crtc_scanline(); others use ->get_vblank_counter(), which also
might acquire sleeping locks on PREEMPT_RT.
At the time the arguments are evaluated within the trace point,
preemption is disabled, and so the locks must not be acquired on
PREEMPT_RT.
Based on this I don't see any other way than to disable these trace
points on PREEMPT_RT.
Reported-by: Luca Abeni <lucabe72@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
|
|
The !in_atomic() check in _wait_for_atomic() triggers on PREEMPT_RT
because the uncore::lock is a spinlock_t and does not disable
preemption or interrupts.
Changing the uncore::lock to a raw_spinlock_t doubles the worst-case
latency on an otherwise idle testbox during testing. Therefore I'm
currently unsure about changing this.
Link: https://lore.kernel.org/all/20211006164628.s2mtsdd2jdbfyf7g@linutronix.de/
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
|
|
Commit
8d7849db3eab7 ("drm/i915: Make sprite updates atomic")
started disabling interrupts across atomic updates. This breaks on
PREEMPT_RT because within this section the code attempts to acquire
spinlock_t locks which are sleeping locks on PREEMPT_RT.
According to the comment, the interrupts are disabled to avoid random
delays and are not required for protection or synchronisation.
If this needs to happen with disabled interrupts on PREEMPT_RT, and the
whole section is restricted to register access, then all sleeping locks
need to be acquired before interrupts are disabled and some functions
may be moved after interrupts are enabled again.
This includes:
- prepare_to_wait() + finish_wait() due its wake queue.
- drm_crtc_vblank_put() -> vblank_disable_fn() drm_device::vbl_lock.
- skl_pfit_enable(), intel_update_plane(), vlv_atomic_update_fifo() and
maybe others due to intel_uncore::lock
- drm_crtc_arm_vblank_event() due to drm_device::event_lock and
drm_device::vblank_time_lock.
Don't disable interrupts on PREEMPT_RT during atomic updates.
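Sketched, the update path then looks like this; the function name is
illustrative:
  #include <linux/irqflags.h>

  static void intel_pipe_update_example(void)    /* illustrative */
  {
          if (!IS_ENABLED(CONFIG_PREEMPT_RT))
                  local_irq_disable();

          /* ... wait for the vblank window and write the registers ... */

          if (!IS_ENABLED(CONFIG_PREEMPT_RT))
                  local_irq_enable();
  }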
[bigeasy: drop local locks, commit message]
Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
|
|
Mario Kleiner suggested in commit
ad3543ede630f ("drm/intel: Push get_scanout_position() timestamping into kms driver.")
spots where preemption should be disabled on PREEMPT_RT. The
difference is that on PREEMPT_RT the intel_uncore::lock disables neither
preemption nor interrupts, and so the region remains preemptible.
The area covers only register reads and writes. The part that worries me
is:
- __intel_get_crtc_scanline(): the worst case is 100us if no match is
found.
- intel_crtc_scanlines_since_frame_timestamp(): not sure how long this
may take in the worst case.
It was in the RT queue for a while and nobody complained.
Disable preemption on PREEMPT_RT during timestamping.
[bigeasy: patch description.]
Cc: Mario Kleiner <mario.kleiner.de@gmail.com>
Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
|
|
printk may invoke the legacy console driver from atomic context. This leads to
a lockdep splat because the console driver will acquire a sleeping lock and the
caller may also hold a spinning lock. This is noticed by lockdep on !PREEMPT_RT
configurations because it would also lead to a problem on PREEMPT_RT.
On PREEMPT_RT the atomic path is always avoided and the console driver is
always invoked from a dedicated thread. Thus the lockdep splat is a false
positive.
Override the lock-context before invoking the console driver.
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
|
|
The 8250 driver no longer depends on @oops_in_progress and
will no longer violate the port->lock locking constraints.
This reverts commit 3d9e6f556e235ddcdc9f73600fdd46fe1736b090.
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
|
|
The write callbacks of legacy consoles make use of spinlocks.
This is not permitted with PREEMPT_RT in atomic contexts.
Create a new kthread to handle printing of all the legacy
consoles (and nbcon consoles if boot consoles are registered).
Since the consoles are printing in a task context, it is no
longer appropriate to support the legacy handover mechanism.
These changes exist only for CONFIG_PREEMPT_RT.
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
|
|
Implement the necessary callbacks to switch the 8250 console driver
to perform as an nbcon console.
Add implementations for the nbcon consoles (write_atomic, write_thread,
driver_enter, driver_exit) and add CON_NBCON to the initial flags.
The legacy code is kept in order to easily switch back to legacy mode
by defining CONFIG_SERIAL_8250_LEGACY_CONSOLE.
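A hedged sketch of the console definition with the nbcon callbacks
wired up; the serial8250_console_* names are illustrative of this
series:
  static struct console univ8250_console = {
          .name           = "ttyS",
          .write_atomic   = serial8250_console_write_atomic,
          .write_thread   = serial8250_console_write_thread,
          .driver_enter   = serial8250_console_driver_enter,
          .driver_exit    = serial8250_console_driver_exit,
          .device         = uart_console_device,
          .setup          = univ8250_console_setup,
          .flags          = CON_PRINTBUFFER | CON_ANYTIME | CON_NBCON,
          .index          = -1,
          .data           = &serial8250_reg,
  };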
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
|
|
The nbcon console's driver_enter() and driver_exit() callbacks need
to lock the port lock in order to synchronize against other hardware
activity (such as adjusting baud rates). However, they cannot use
the uart_port_lock() wrappers because the printk subsystem will
perform nbcon locking after calling the driver_enter() callback.
Provide low-level variants __uart_port_lock_irqsave() and
__uart_port_unlock_irqrestore() for this purpose. These are only
to be used by the driver_enter()/driver_exit() callbacks.
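A sketch of the low-level variants; unlike the uart_port_lock()
wrappers they touch only the spinlock and do no nbcon handling:
  #include <linux/serial_core.h>

  static inline void __uart_port_lock_irqsave(struct uart_port *up,
                                              unsigned long *flags)
  {
          spin_lock_irqsave(&up->lock, *flags);
  }

  static inline void __uart_port_unlock_irqrestore(struct uart_port *up,
                                                   unsigned long flags)
  {
          spin_unlock_irqrestore(&up->lock, flags);
  }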
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
|
|
Contexts may become nbcon owners for various reasons, not just
for printing. Indeed, the port->lock wrapper takes ownership
for anything relating to the hardware.
Since ownership can be lost at any time due to handover or
takeover, a context _should_ be prepared to back out
immediately and carefully. However, there are many scenarios
where the context _must_ reacquire ownership in order to
finalize or revert hardware changes.
One such example is when interrupts are disabled by a context.
No other context will automagically re-enable the interrupts.
For this case, the disabling context _must_ reacquire nbcon
ownership so that it can re-enable the interrupts.
Provide nbcon_reacquire() for exactly this purpose.
Note that for printing contexts, after a successful reacquire
the context will have no output buffer because that has been
lost. nbcon_reacquire() cannot be used to resume printing.
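A hedged usage sketch from a write callback's point of view; the IER
restore helper is illustrative:
  #include <linux/console.h>
  #include <linux/serial_8250.h>

  /* illustrative helper that re-enables the device interrupts */
  void example_restore_irqs(struct uart_8250_port *up);

  static bool example_write(struct nbcon_write_context *wctxt,
                            struct uart_8250_port *up)
  {
          if (!nbcon_exit_unsafe(wctxt)) {
                  /* ownership was taken over mid-sequence: reacquire
                   * solely to restore the hardware state; the output
                   * buffer is gone, so printing cannot resume */
                  nbcon_reacquire(wctxt);
                  example_restore_irqs(up);
                  return false;
          }
          return true;
  }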
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
|
|
Allow the 'active' attribute to list nbcon consoles.
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
|
|
Update /proc/consoles output to show 'W' if an nbcon write
callback is implemented (write_atomic or write_thread).
Also update /proc/consoles output to show 'N' if it is an
nbcon console.
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
|
|
If there are no boot consoles, the printing threads are started
in early_initcall.
If there are boot consoles, the printing threads are started
after the last boot console has unregistered. The printing
threads do not need to be concerned about boot consoles because
boot consoles cannot register once a non-boot console has
registered.
Until a printing thread of a console has started, that console
will print using its write_atomic() callback in the printk() caller
context.
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
|