summaryrefslogtreecommitdiff
path: root/kernel/time
AgeCommit message (Collapse)AuthorFilesLines
2019-08-28posix-cpu-timers: Move state tracking to struct posix_cputimersThomas Gleixner1-33/+40
Put it where it belongs and clean up the ifdeffery in fork completely. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190821192922.743229404@linutronix.de
2019-08-28posix-cpu-timers: Deduplicate rlimit handlingThomas Gleixner1-42/+25
Both thread and process expiry functions have the same functionality for sending signals for soft and hard RLIMITs duplicated in 4 different ways. Split it out into a common function and cleanup the callsites. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/20190821192922.653276779@linutronix.de
2019-08-28posix-cpu-timers: Remove pointless comparisonsThomas Gleixner1-9/+7
The soft RLIMIT expiry code checks whether the soft limit is greater than the hard limit. That's pointless because if the soft RLIMIT is greater than the hard RLIMIT then that code cannot be reached as the hard RLIMIT check is before that and already killed the process. Remove it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/20190821192922.548747613@linutronix.de
2019-08-28posix-cpu-timers: Get rid of 64bit divisionsThomas Gleixner1-10/+14
Instead of dividing A to match the units of B it's more efficient to multiply B to match the units of A. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/20190821192922.458286860@linutronix.de
2019-08-28posix-cpu-timers: Consolidate timer expiry furtherThomas Gleixner1-33/+30
With the array based samples and expiry cache, the expiry function can use a loop to collect timers from the clock specific lists. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/20190821192922.365469982@linutronix.de
2019-08-28posix-cpu-timers: Get rid of zero checksThomas Gleixner1-23/+15
Deactivation of the expiry cache is done by setting all clock caches to 0. That requires to have a check for zero in all places which update the expiry cache: if (cache == 0 || new < cache) cache = new; Use U64_MAX as the deactivated value, which allows to remove the zero checks when updating the cache and reduces it to the obvious check: if (new < cache) cache = new; This also removes the weird workaround in do_prlimit() which was required to convert a RLIMIT_CPU value of 0 (immediate expiry) to 1 because handing in 0 to the posix CPU timer code would have effectively disarmed it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/20190821192922.275086128@linutronix.de
2019-08-28posix-cpu-timers: Respect INFINITY for hard RTTIME limitThomas Gleixner1-1/+1
The RTIME limit expiry code does not check the hard RTTIME limit for INFINITY, i.e. being disabled. Add it. While this could be considered an ABI breakage if something would depend on this behaviour. Though it's highly unlikely to have an effect because RLIM_INFINITY is at minimum INT_MAX and the RTTIME limit is in seconds, so the timer would fire after ~68 years. Adding this obvious correct limit check also allows further consolidation of that code and is a prerequisite for cleaning up the 0 based checks and the rlimit setter code. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/20190821192922.078293002@linutronix.de
2019-08-28posix-cpu-timers: Switch thread group sampling to arrayThomas Gleixner2-67/+48
That allows more simplifications in various places. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/20190821192921.988426956@linutronix.de
2019-08-28posix-cpu-timers: Restructure expiry arrayThomas Gleixner1-49/+56
Now that the abused struct task_cputime is gone, it's more natural to bundle the expiry cache and the list head of each clock into a struct and have an array of those structs. Follow the hrtimer naming convention of 'bases' and rename the expiry cache to 'nextevt' and adapt all usage sites. Generates also better code .text size shrinks by 80 bytes. Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1908262021140.1939@nanos.tec.linutronix.de
2019-08-28posix-cpu-timers: Remove cputime_expiresThomas Gleixner1-10/+0
The last users of the magic struct cputime based expiry cache are gone. Remove the leftovers. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/20190821192921.790209622@linutronix.de
2019-08-28posix-cpu-timers: Make expiry checks array basedThomas Gleixner1-49/+36
The expiry cache is an array indexed by clock ids. The new sample functions allow to retrieve a corresponding array of samples. Convert the fastpath expiry checks to make use of the new sample functions and do the comparisons on the sample and the expiry array. Make the check for the expiry array being zero array based as well. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/20190821192921.695481430@linutronix.de
2019-08-28posix-cpu-timers: Provide array based sample functionsThomas Gleixner1-0/+26
Instead of using task_cputime and doing the addition of utime and stime at all call sites, it's way simpler to have a sample array which allows indexed based checks against the expiry cache array. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/20190821192921.590362974@linutronix.de
2019-08-28posix-cpu-timers: Switch check_*_timers() to array cacheThomas Gleixner1-15/+11
Use the array based expiry cache in check_thread_timers() and convert the store in check_process_timers() for consistency. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/20190821192921.408222378@linutronix.de
2019-08-28posix-cpu-timers: Simplify set_process_cpu_timer()Thomas Gleixner1-16/+8
The expiry cache can now be accessed as an array. Replace the per clock checks with a simple comparison of the clock indexed array member. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/20190821192921.303316423@linutronix.de
2019-08-28posix-cpu-timers: Simplify timer queueingThomas Gleixner1-34/+21
Now that the expiry cache can be accessed as an array, the per clock checking can be reduced to just comparing the corresponding array elements. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/20190821192921.212129449@linutronix.de
2019-08-28posix-cpu-timers: Provide array based access to expiry cacheThomas Gleixner1-1/+11
Using struct task_cputime for the expiry cache is a pretty odd choice and comes with magic defines to rename the fields for usage in the expiry cache. struct task_cputime is basically a u64 array with 3 members, but it has distinct members. The expiry cache content is different than the content of task_cputime because expiry[PROF] = task_cputime.stime + task_cputime.utime expiry[VIRT] = task_cputime.utime expiry[SCHED] = task_cputime.sum_exec_runtime So there is no direct mapping between task_cputime and the expiry cache and the #define based remapping is just a horrible hack. Having the expiry cache array based allows further simplification of the expiry code. To avoid an all in one cleanup which is hard to review add a temporary anonymous union into struct task_cputime which allows array based access to it. That requires to reorder the members. Add a build time sanity check to validate that the members are at the same place. The union and the build time checks will be removed after conversion. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/20190821192921.105793824@linutronix.de
2019-08-28posix-cpu-timers: Move expiry cache into struct posix_cputimersThomas Gleixner1-18/+27
The expiry cache belongs into the posix_cputimers container where the other cpu timers information is. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/20190821192921.014444012@linutronix.de
2019-08-28posix-cpu-timers: Create a container structThomas Gleixner1-10/+10
Per task/process data of posix CPU timers is all over the place which makes the code hard to follow and requires ifdeffery. Create a container to hold all this information in one place, so data is consolidated and the ifdeffery can be confined to the posix timer header file and removed from places like fork. As a first step, move the cpu_timers list head array into the new struct and clean up the initializers and simplify fork. The remaining #ifdef in fork will be removed later. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/20190821192920.819418976@linutronix.de
2019-08-28posix-cpu-timers: Move prof/virt_ticks into callerThomas Gleixner1-21/+9
The functions have only one caller left. No point in having them. Move the almost duplicated code into the caller and simplify it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/20190821192920.729298382@linutronix.de
2019-08-28posix-cpu-timers: Sample task times once in expiry checkThomas Gleixner1-4/+6
Sampling the task times twice does not make sense. Do it once. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/20190821192920.639878168@linutronix.de
2019-08-28posix-cpu-timers: Get rid of pointer indirectionThomas Gleixner1-28/+22
Now that the sample functions have no return value anymore, the result can simply be returned instead of using pointer indirection. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/20190821192920.535079278@linutronix.de
2019-08-28posix-cpu-timers: Simplify sample functionsThomas Gleixner1-15/+13
All callers hand in a valdiated clock id. Remove the return value which was unchecked in most places anyway. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/20190821192920.430475832@linutronix.de
2019-08-28posix-cpu-timers: Remove pointless return value checkThomas Gleixner1-3/+2
set_process_cpu_timer() checks already whether the clock id is valid. No point in checking the return value of the sample function. That allows to simplify the sample function later. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/20190821192920.339725769@linutronix.de
2019-08-28posix-cpu-timers: Use clock ID in posix_cpu_timer_rearm()Thomas Gleixner1-2/+3
Extract the clock ID (PROF/VIRT/SCHED) from the clock selector and use it as argument to the sample functions. That allows to simplify them once all callers are fixed. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/20190821192920.245357769@linutronix.de
2019-08-28posix-cpu-timers: Use clock ID in posix_cpu_timer_get()Thomas Gleixner1-2/+3
Extract the clock ID (PROF/VIRT/SCHED) from the clock selector and use it as argument to the sample functions. That allows to simplify them once all callers are fixed. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/20190821192920.155487201@linutronix.de
2019-08-28posix-cpu-timers: Use clock ID in posix_cpu_timer_set()Thomas Gleixner1-5/+6
Extract the clock ID (PROF/VIRT/SCHED) from the clock selector and use it as argument to the sample functions. That allows to simplify them once all callers are fixed. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/20190821192920.050770464@linutronix.de
2019-08-28posix-cpu-timers: Consolidate thread group sample codeThomas Gleixner1-39/+20
cpu_clock_sample_group() and cpu_timer_sample_group() are almost the same. Before the rename one called thread_group_cputimer() and the other thread_group_cputime(). Really intuitive function names. Consolidate the functions and also avoid the thread traversal when the thread group's accounting is already active. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/20190821192919.960966884@linutronix.de
2019-08-28posix-cpu-timers: Rename thread_group_cputimer() and make it staticThomas Gleixner1-2/+15
thread_group_cputimer() is a complete misnomer. The function does two things: - For arming process wide timers it makes sure that the atomic time storage is up to date. If no cpu timer is armed yet, then the atomic time storage is not updated by the scheduler for performance reasons. In that case a full summing up of all threads needs to be done and the update needs to be enabled. - Samples the current time into the caller supplied storage. Rename it to thread_group_start_cputime(), make it static and fixup the callsite. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/20190821192919.869350319@linutronix.de
2019-08-28posix-cpu-timers: Sample directly in timer checkThomas Gleixner1-3/+4
The thread group accounting is active, otherwise the expiry function would not be running. Sample the thread group time directly. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/20190821192919.780348088@linutronix.de
2019-08-28itimers: Use quick sample functionThomas Gleixner1-1/+1
get_itimer() locks sighand lock and checks whether the timer is already expired. If it is not expired then the thread group cputime accounting is already enabled. Use the sampling function not the one which is meant for starting a timer. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/20190821192919.689713638@linutronix.de
2019-08-28posix-cpu-timers: Provide quick sample function for itimerThomas Gleixner1-0/+21
get_itimer() needs a sample of the current thread group cputime. It invokes thread_group_cputimer() - which is a misnomer. That function also starts eventually the group cputime accouting which is bogus because the accounting is already active when a timer is armed. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/20190821192919.599658199@linutronix.de
2019-08-28posix-cpu-timers: Use common permission check in posix_cpu_timer_create()Thomas Gleixner1-32/+3
Yet another copy of the same thing gone... Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/20190821192919.505833418@linutronix.de
2019-08-28posix-cpu-timers: Use common permission check in posix_cpu_clock_get()Thomas Gleixner1-43/+14
Replace the next slightly different copy of permission checks. That also removes the necessarity to check the return value of the sample functions because the clock id is already validated. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/20190821192919.414813172@linutronix.de
2019-08-28posix-cpu-timers: Provide task validation functionsThomas Gleixner1-21/+44
The code contains three slightly different copies of validating whether a given clock resolves to a valid task and whether the current caller has permissions to access it. Create central functions. Replace check_clock() as a first step and rename it to something sensible. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190821192919.326097175@linutronix.de
2019-08-23timekeeping/vsyscall: Prevent math overflow in BOOTTIME updateThomas Gleixner2-9/+18
The VDSO update for CLOCK_BOOTTIME has a overflow issue as it shifts the nanoseconds based boot time offset left by the clocksource shift. That overflows once the boot time offset becomes large enough. As a consequence CLOCK_BOOTTIME in the VDSO becomes a random number causing applications to misbehave. Fix it by storing a timespec64 representation of the offset when boot time is adjusted and add that to the MONOTONIC base time value in the vdso data page. Using the timespec64 representation avoids a 64bit division in the update code. Fixes: 44f57d788e7d ("timekeeping: Provide a generic update_vsyscall() implementation") Reported-by: Chris Clayton <chris2553@googlemail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Chris Clayton <chris2553@googlemail.com> Tested-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1908221257580.1983@nanos.tec.linutronix.de
2019-08-21posix-cpu-timers: Remove tsk argument from run_posix_cpu_timers()Thomas Gleixner2-3/+4
It's always current. Don't give people wrong ideas. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/20190819143801.945469967@linutronix.de
2019-08-21posix-cpu-timers: Sanitize bogus WARNONSThomas Gleixner1-7/+13
Warning when p == NULL and then proceeding and dereferencing p does not make any sense as the kernel will crash with a NULL pointer dereference right away. Bailing out when p == NULL and returning an error code does not cure the underlying problem which caused p to be NULL. Though it might allow to do proper debugging. Same applies to the clock id check in set_process_cpu_timer(). Clean them up and make them return without trying to do further damage. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/20190819143801.846497772@linutronix.de
2019-08-21hrtimer: Don't take expiry_lock when timer is currently migratedJulien Grall1-1/+5
migration_base is used as a placeholder when an hrtimer is migrated to a different CPU. In the case that hrtimer_cancel_wait_running() hits a timer which is currently migrated it would pointlessly acquire the expiry lock of the migration base, which is even not initialized. Surely it could be initialized, but there is absolutely no point in acquiring this lock because the timer is guaranteed not to run it's callback for which the caller waits to finish on that base. So it would just do the inc/lock/dec/unlock dance for nothing. As the base switch is short and non-preemptible, there is no issue when the wait function returns immediately. The timer base and base->cpu_base cannot be NULL in the code path which is invoking that, so just replace those checks with a check whether base is migration base. [ tglx: Updated from RT patch. Massaged changelog. Added comment. ] Signed-off-by: Julien Grall <julien.grall@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190821092409.13225-4-julien.grall@arm.com
2019-08-21hrtimer: Protect lockless access to timer->baseJulien Grall1-1/+2
The update to timer->base is protected by the base->cpu_base->lock(). However, hrtimer_cancel_wait_running() does access it lockless. So the compiler is allowed to refetch timer->base which can cause havoc when the timer base is changed concurrently. Use READ_ONCE() to prevent this. [ tglx: Adapted from a RT patch ] Signed-off-by: Julien Grall <julien.grall@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190821092409.13225-2-julien.grall@arm.com
2019-08-21PM / wakeup: Show wakeup sources stats in sysfsTri Vo1-1/+1
Add an ID and a device pointer to 'struct wakeup_source'. Use them to to expose wakeup sources statistics in sysfs under /sys/class/wakeup/wakeup<ID>/*. Co-developed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Co-developed-by: Stephen Boyd <swboyd@chromium.org> Signed-off-by: Stephen Boyd <swboyd@chromium.org> Signed-off-by: Tri Vo <trong@android.com> Tested-by: Kalesh Singh <kaleshsingh@google.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-08-20posix-cpu-timers: Fixup stale commentThomas Gleixner1-3/+4
The comment above cleanup_timers() is outdated. The timers are only removed from the task/process list heads but not modified in any other way. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/20190819143801.747233612@linutronix.de
2019-08-20hrtimer: Improve comments on handling priority inversion against softirq kthreadFrederic Weisbecker2-4/+16
The handling of a priority inversion between timer cancelling and a a not well defined possible preemption of softirq kthread is not very clear. Especially in the posix timers side it's unclear why there is a specific RT wait callback. All the nice explanations can be found in the initial changelog of f61eff83cec9 (hrtimer: Prepare support for PREEMPT_RT"). Extract the detailed informations from there and put it into comments. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190820132656.GC2093@lenoir
2019-08-20posix-timers: Use a callback for cancel synchronization on PREEMPT_RTThomas Gleixner3-1/+32
Posix timer delete retry loops are affected by the same priority inversion and live lock issues as the other timers. Provide a RT specific synchronization function which keeps a reference to the timer by holding rcu read lock to prevent the timer from being freed, dropping the timer lock and invoking the timer specific wait function via a new callback. This does not yet cover posix CPU timers because they need more special treatment on PREEMPT_RT. [ This is folded into the original attempt which did not use a callback. ] Originally-by: Anna-Maria Gleixenr <anna-maria@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lkml.kernel.org/r/20190819143801.656864506@linutronix.de
2019-08-01posix-timers: Move rcu_head out of it unionSebastian Andrzej Siewior1-2/+2
Timer deletion on PREEMPT_RT is prone to priority inversion and live locks. The hrtimer code has a synchronization mechanism for this. Posix CPU timers will grow one. But that mechanism cannot be invoked while holding the k_itimer lock because that can deadlock against the running timer callback. So the lock must be dropped which allows the timer to be freed. The timer free can be prevented by taking RCU readlock before dropping the lock, but because the rcu_head is part of the 'it' union a concurrent free will overwrite the hrtimer on which the task is trying to synchronize. Move the rcu_head out of the union to prevent this. [ tglx: Fixed up kernel-doc. Rewrote changelog ] Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20190730223828.965541887@linutronix.de
2019-08-01posix-timers: Rework cancel retry loopsThomas Gleixner1-6/+23
As a preparatory step for adding the PREEMPT RT specific synchronization mechanism to wait for a running timer callback, rework the timer cancel retry loops so they call a common function. This allows trivial substitution in one place. Originally-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20190730223828.874901027@linutronix.de
2019-08-01posix-timers: Cleanup the flag/flags confusionThomas Gleixner1-5/+5
do_timer_settime() has a 'flags' argument and uses 'flag' for the interrupt flags, which is confusing at best. Rename the argument so 'flags' can be used for interrupt flags as usual. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20190730223828.782664411@linutronix.de
2019-08-01itimers: Prepare for PREEMPT_RTAnna-Maria Gleixner1-0/+1
Use the hrtimer_cancel_wait_running() synchronization mechanism to prevent priority inversion and live locks on PREEMPT_RT. As a benefit the retry loop gains the missing cpu_relax() on !RT. [ tglx: Split out of combo patch ] Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20190730223828.690771827@linutronix.de
2019-08-01alarmtimer: Prepare for PREEMPT_RTAnna-Maria Gleixner1-1/+1
Use the hrtimer_cancel_wait_running() synchronization mechanism to prevent priority inversion and live locks on PREEMPT_RT. [ tglx: Split out of combo patch ] Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20190730223828.508744705@linutronix.de
2019-08-01timers: Prepare support for PREEMPT_RTAnna-Maria Gleixner1-8/+95
When PREEMPT_RT is enabled, the soft interrupt thread can be preempted. If the soft interrupt thread is preempted in the middle of a timer callback, then calling del_timer_sync() can lead to two issues: - If the caller is on a remote CPU then it has to spin wait for the timer handler to complete. This can result in unbound priority inversion. - If the caller originates from the task which preempted the timer handler on the same CPU, then spin waiting for the timer handler to complete is never going to end. To avoid these issues, add a new lock to the timer base which is held around the execution of the timer callbacks. If del_timer_sync() detects that the timer callback is currently running, it blocks on the expiry lock. When the callback is finished, the expiry lock is dropped by the softirq thread which wakes up the waiter and the system makes progress. This addresses both the priority inversion and the life lock issues. This mechanism is not used for timers which are marked IRQSAFE as for those preemption is disabled accross the callback and therefore this situation cannot happen. The callbacks for such timers need to be individually audited for RT compliance. The same issue can happen in virtual machines when the vCPU which runs a timer callback is scheduled out. If a second vCPU of the same guest calls del_timer_sync() it will spin wait for the other vCPU to be scheduled back in. The expiry lock mechanism would avoid that. It'd be trivial to enable this when paravirt spinlocks are enabled in a guest, but it's not clear whether this is an actual problem in the wild, so for now it's an RT only mechanism. As the softirq thread can be preempted with PREEMPT_RT=y, the SMP variant of del_timer_sync() needs to be used on UP as well. [ tglx: Refactored it for mainline ] Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20190726185753.832418500@linutronix.de
2019-08-01hrtimer: Prepare support for PREEMPT_RTAnna-Maria Gleixner1-6/+89
When PREEMPT_RT is enabled, the soft interrupt thread can be preempted. If the soft interrupt thread is preempted in the middle of a timer callback, then calling hrtimer_cancel() can lead to two issues: - If the caller is on a remote CPU then it has to spin wait for the timer handler to complete. This can result in unbound priority inversion. - If the caller originates from the task which preempted the timer handler on the same CPU, then spin waiting for the timer handler to complete is never going to end. To avoid these issues, add a new lock to the timer base which is held around the execution of the timer callbacks. If hrtimer_cancel() detects that the timer callback is currently running, it blocks on the expiry lock. When the callback is finished, the expiry lock is dropped by the softirq thread which wakes up the waiter and the system makes progress. This addresses both the priority inversion and the life lock issues. The same issue can happen in virtual machines when the vCPU which runs a timer callback is scheduled out. If a second vCPU of the same guest calls hrtimer_cancel() it will spin wait for the other vCPU to be scheduled back in. The expiry lock mechanism would avoid that. It'd be trivial to enable this when paravirt spinlocks are enabled in a guest, but it's not clear whether this is an actual problem in the wild, so for now it's an RT only mechanism. [ tglx: Refactored it for mainline ] Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20190726185753.737767218@linutronix.de