summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2019-06-28x86/hpet: Wrap legacy clockevent in hpet_channelThomas Gleixner1-22/+27
For HPET channel 0 there exist two clockevent structures right now: - the static hpet_clockevent - the clockevent in channel 0 storage The goal is to use the clockevent in the channel storage, remove the static variable and share code with the MSI implementation. As a first step wrap the legacy clockevent into a hpet_channel struct and convert the users. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Andi Kleen <andi.kleen@intel.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Link: https://lkml.kernel.org/r/20190623132436.368141247@linutronix.de
2019-06-28x86/hpet: Use cached info instead of extra flagsThomas Gleixner1-53/+23
Now that HPET clockevent support is integrated into the channel data, reuse the cached boot configuration instead of copying the same information into a flags field. This also allows to consolidate the reservation code into one place, which can now solely depend on the mode information. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Andi Kleen <andi.kleen@intel.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Link: https://lkml.kernel.org/r/20190623132436.277510163@linutronix.de
2019-06-28x86/hpet: Move clockevents into channelsThomas Gleixner3-85/+64
Instead of allocating yet another data structure, move the clock event data into the channel structure. This allows further consolidation of the reservation code and the reuse of the cached boot config to replace the extra flags in the clockevent data. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Andi Kleen <andi.kleen@intel.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Link: https://lkml.kernel.org/r/20190623132436.185851116@linutronix.de
2019-06-28x86/hpet: Rename variables to prepare for switching to channelsIngo Molnar1-62/+62
struct hpet_dev is gone with the next change as the clockevent storage moves into struct hpet_channel. So the variable name hdev will not make sense anymore. Ditto for timer vs. channel and similar details. Doing the rename in the change makes the patch harder to review. Doing it afterward is problematic vs. tracking down issues. Doing it upfront is the easiest solution as it does not change functionality. Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Andi Kleen <andi.kleen@intel.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Link: https://lkml.kernel.org/r/20190623132436.093113681@linutronix.de
2019-06-28x86/hpet: Add function to select a /dev/hpet channelThomas Gleixner1-0/+17
If CONFIG_HPET=y is enabled the x86 specific HPET code should reserve at least one channel for the /dev/hpet character device, so that not all channels are absorbed for per CPU clockevent devices. Create a function to assign HPET_MODE_DEVICE so the rework of the clockevents allocation code can utilize the mode information instead of reducing the number of evaluated channels by #ifdef hackery. The function is not yet used, but provided as a separate patch for ease of review. It will be used when the rework of the clockevent selection takes place. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Andi Kleen <andi.kleen@intel.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Link: https://lkml.kernel.org/r/20190623132436.002758910@linutronix.de
2019-06-28x86/hpet: Add mode information to struct hpet_channelThomas Gleixner1-0/+11
The usage of the individual HPET channels is not tracked in a central place. The information is scattered in different data structures. Also the HPET reservation in the HPET character device is split out into several places which makes the code hard to follow. Assigning a mode to the channel allows to consolidate the reservation code and paves the way for further simplifications. As a first step set the mode of the legacy channels when the HPET is in legacy mode. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Andi Kleen <andi.kleen@intel.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Link: https://lkml.kernel.org/r/20190623132435.911652981@linutronix.de
2019-06-28x86/hpet: Use cached channel dataThomas Gleixner1-25/+16
Instead of rereading the HPET registers over and over use the information which was cached in hpet_enable(). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Andi Kleen <andi.kleen@intel.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Link: https://lkml.kernel.org/r/20190623132435.821728550@linutronix.de
2019-06-28x86/hpet: Introduce struct hpet_base and struct hpet_channelThomas Gleixner1-34/+48
Introduce new data structures to replace the ad hoc collection of separate variables and pointers. Replace the boot configuration store and restore as a first step. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Andi Kleen <andi.kleen@intel.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Link: https://lkml.kernel.org/r/20190623132435.728456320@linutronix.de
2019-06-28x86/hpet: Coding style cleanupIngo Molnar1-17/+26
Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Andi Kleen <andi.kleen@intel.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Link: https://lkml.kernel.org/r/20190623132435.637420368@linutronix.de
2019-06-28x86/hpet: Clean up commentsIngo Molnar1-18/+23
Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Andi Kleen <andi.kleen@intel.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Link: https://lkml.kernel.org/r/20190623132435.545653922@linutronix.de
2019-06-28x86/hpet: Make naming consistentIngo Molnar1-10/+10
Use 'evt' for clockevents pointers and capitalize HPET in comments. Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Andi Kleen <andi.kleen@intel.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Link: https://lkml.kernel.org/r/20190623132435.454138339@linutronix.de
2019-06-28x86/hpet: Remove not required includesIngo Molnar1-11/+1
Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Andi Kleen <andi.kleen@intel.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Link: https://lkml.kernel.org/r/20190623132435.348089155@linutronix.de
2019-06-28x86/hpet: Decapitalize and rename EVT_TO_HPET_DEVThomas Gleixner1-17/+10
It's a function not a macro and the upcoming changes use channel for the individual hpet timer units to allow a step by step refactoring approach. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Andi Kleen <andi.kleen@intel.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Link: https://lkml.kernel.org/r/20190623132435.241032433@linutronix.de
2019-06-28x86/hpet: Simplify counter validationThomas Gleixner1-6/+4
There is no point to loop for 200k TSC cycles to check afterwards whether the HPET counter is working. Read the counter inside of the loop and break out when the counter value changed. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Andi Kleen <andi.kleen@intel.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Link: https://lkml.kernel.org/r/20190623132435.149535103@linutronix.de
2019-06-28x86/hpet: Separate counter check out of clocksource register codeThomas Gleixner1-34/+31
The init code checks whether the HPET counter works late in the init function when the clocksource is registered. That should happen right with the other sanity checks. Split it into a separate validation function and move it to the other sanity checks. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Andi Kleen <andi.kleen@intel.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Link: https://lkml.kernel.org/r/20190623132435.058540608@linutronix.de
2019-06-28x86/hpet: Shuffle code around for readability sakeThomas Gleixner1-40/+41
It doesn't make sense to have init functions in the middle of other code. Aside of that, further changes in that area create horrible diffs if the code stays where it is. No functional change Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Andi Kleen <andi.kleen@intel.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Link: https://lkml.kernel.org/r/20190623132434.951733064@linutronix.de
2019-06-28x86/hpet: Move static and global variables to one placeThomas Gleixner1-28/+22
Having static and global variables sprinkled all over the code is just annoying to read. Move them all to the top of the file. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Andi Kleen <andi.kleen@intel.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Link: https://lkml.kernel.org/r/20190623132434.860549134@linutronix.de
2019-06-28x86/hpet: Sanitize stub functionsThomas Gleixner1-9/+3
Mark them inline and remove the pointless 'return;' statement. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Andi Kleen <andi.kleen@intel.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Link: https://lkml.kernel.org/r/20190623132434.754768274@linutronix.de
2019-06-28x86/hpet: Mark init functions __initThomas Gleixner1-3/+3
They are only called from init code. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Andi Kleen <andi.kleen@intel.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Link: https://lkml.kernel.org/r/20190623132434.645357869@linutronix.de
2019-06-28x86/hpet: Remove the unused hpet_msi_read() functionThomas Gleixner2-8/+0
No users. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Andi Kleen <andi.kleen@intel.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Link: https://lkml.kernel.org/r/20190623132434.553729327@linutronix.de
2019-06-28x86/hpet: Remove unused parameter from hpet_next_event()Thomas Gleixner1-5/+5
The clockevent device pointer is not used in this function. While at it, rename the misnamed 'timer' parameter to 'channel', which makes it clear what this parameter means. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Andi Kleen <andi.kleen@intel.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Link: https://lkml.kernel.org/r/20190623132434.447880978@linutronix.de
2019-06-28x86/hpet: Remove pointless x86-64 specific #includeThomas Gleixner1-4/+0
Nothing requires asm/pgtable.h here anymore. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Andi Kleen <andi.kleen@intel.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Link: https://lkml.kernel.org/r/20190623132434.339011567@linutronix.de
2019-06-28x86/hpet: Restructure init codeThomas Gleixner1-38/+43
As a preparatory change for further consolidation, restructure the HPET init code so it becomes more readable. Fix up misleading and stale comments and rename variables so they actually make sense. No intended functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Andi Kleen <andi.kleen@intel.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Link: https://lkml.kernel.org/r/20190623132434.247842972@linutronix.de
2019-06-28x86/hpet: Replace printk(KERN...) with pr_...()Thomas Gleixner1-26/+19
And sanitize the format strings while at it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Andi Kleen <andi.kleen@intel.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Link: https://lkml.kernel.org/r/20190623132434.140411339@linutronix.de
2019-06-28x86/hpet: Simplify CPU online codeThomas Gleixner1-29/+2
The indirection via work scheduled on the upcoming CPU was necessary with the old hotplug code because the online callback was invoked on the control CPU not on the upcoming CPU. The rework of the CPU hotplug core guarantees that the online callbacks are invoked on the upcoming CPU. Remove the now pointless work redirection. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Andi Kleen <andi.kleen@intel.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Link: https://lkml.kernel.org/r/20190623132434.047254075@linutronix.de
2019-06-28x86/unwind/orc: Fall back to using frame pointers for generated codeJosh Poimboeuf1-4/+22
The ORC unwinder can't unwind through BPF JIT generated code because there are no ORC entries associated with the code. If an ORC entry isn't available, try to fall back to frame pointers. If BPF and other generated code always do frame pointer setup (even with CONFIG_FRAME_POINTERS=n) then this will allow ORC to unwind through most generated code despite there being no corresponding ORC entries. Fixes: d15d356887e7 ("perf/x86: Make perf callchains work without CONFIG_FRAME_POINTER") Reported-by: Song Liu <songliubraving@fb.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Kairui Song <kasong@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Borislav Petkov <bp@alien8.de> Link: https://lkml.kernel.org/r/b6f69208ddff4343d56b7bfac1fc7cfcd62689e8.1561595111.git.jpoimboe@redhat.com
2019-06-28perf/x86: Always store regs->ip in perf_callchain_kernel()Song Liu1-5/+5
The stacktrace_map_raw_tp BPF selftest is failing because the RIP saved by perf_arch_fetch_caller_regs() isn't getting saved by perf_callchain_kernel(). This was broken by the following commit: d15d356887e7 ("perf/x86: Make perf callchains work without CONFIG_FRAME_POINTER") With that change, when starting with non-HW regs, the unwinder starts with the current stack frame and unwinds until it passes up the frame which called perf_arch_fetch_caller_regs(). So regs->ip needs to be saved deliberately. Fixes: d15d356887e7 ("perf/x86: Make perf callchains work without CONFIG_FRAME_POINTER") Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Kairui Song <kasong@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Borislav Petkov <bp@alien8.de> Link: https://lkml.kernel.org/r/3975a298fa52b506fea32666d8ff6a13467eee6d.1561595111.git.jpoimboe@redhat.com
2019-06-28selftests/x86: Add a test for process_vm_readv() on the vsyscall pageAndy Lutomirski1-0/+35
get_gate_page() is a piece of somewhat alarming code to make get_user_pages() work on the vsyscall page. Test it via process_vm_readv(). Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Florian Weimer <fweimer@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Kernel Hardening <kernel-hardening@lists.openwall.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lkml.kernel.org/r/0fe34229a9330e8f9de9765967939cc4f1cf26b1.1561610354.git.luto@kernel.org
2019-06-28x86/vsyscall: Add __ro_after_init to global variablesAndy Lutomirski1-2/+2
The vDSO is only configurable by command-line options, so make its global variables __ro_after_init. This seems highly unlikely to ever stop an exploit, but it's nicer anyway. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Florian Weimer <fweimer@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Kernel Hardening <kernel-hardening@lists.openwall.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lkml.kernel.org/r/a386925835e49d319e70c4d7404b1f6c3c2e3702.1561610354.git.luto@kernel.org
2019-06-28x86/vsyscall: Change the default vsyscall mode to xonlyAndy Lutomirski1-1/+1
The use case for full emulation over xonly is very esoteric, e.g. magic instrumentation tools. Change the default to the safer xonly mode. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Florian Weimer <fweimer@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Kernel Hardening <kernel-hardening@lists.openwall.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lkml.kernel.org/r/30539f8072d2376b9c9efcc07e6ed0d6bf20e882.1561610354.git.luto@kernel.org
2019-06-28selftests/x86/vsyscall: Verify that vsyscall=none blocks executionAndy Lutomirski1-24/+52
If vsyscall=none accidentally still allowed vsyscalls, the test wouldn't fail. Fix it. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Florian Weimer <fweimer@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Kernel Hardening <kernel-hardening@lists.openwall.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lkml.kernel.org/r/b413397c804265f8865f3e70b14b09485ea7c314.1561610354.git.luto@kernel.org
2019-06-28x86/vsyscall: Document odd SIGSEGV error code for vsyscallsAndy Lutomirski2-1/+15
Even if vsyscall=none, user page faults on the vsyscall page are reported as though the PROT bit in the error code was set. Add a comment explaining why this is probably okay and display the value in the test case. While at it, explain why the behavior is correct with respect to PKRU. Modify also the selftest to print the odd error code so that there is a way to demonstrate the odd behaviour. If anyone really cares about more accurate emulation, the behaviour could be changed. But that needs a real good justification. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Florian Weimer <fweimer@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Kernel Hardening <kernel-hardening@lists.openwall.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lkml.kernel.org/r/75c91855fd850649ace162eec5495a1354221aaa.1561610354.git.luto@kernel.org
2019-06-28x86/vsyscall: Show something useful on a read faultAndy Lutomirski3-9/+27
Just segfaulting the application when it tries to read the vsyscall page in xonly mode is not helpful for those who need to debug it. Emit a hint. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Florian Weimer <fweimer@redhat.com> Cc: Jann Horn <jannh@google.com> Link: https://lkml.kernel.org/r/8016afffe0eab497be32017ad7f6f7030dc3ba66.1561610354.git.luto@kernel.org
2019-06-28x86/vsyscall: Add a new vsyscall=xonly modeAndy Lutomirski3-12/+44
With vsyscall emulation on, a readable vsyscall page is still exposed that contains syscall instructions that validly implement the vsyscalls. This is required because certain dynamic binary instrumentation tools attempt to read the call targets of call instructions in the instrumented code. If the instrumented code uses vsyscalls, then the vsyscall page needs to contain readable code. Unfortunately, leaving readable memory at a deterministic address can be used to help various ASLR bypasses, so some hardening value can be gained by disallowing vsyscall reads. Given how rarely the vsyscall page needs to be readable, add a mechanism to make the vsyscall page be execute only. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Florian Weimer <fweimer@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Kernel Hardening <kernel-hardening@lists.openwall.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lkml.kernel.org/r/d17655777c21bc09a7af1bbcf74e6f2b69a51152.1561610354.git.luto@kernel.org
2019-06-28Documentation/admin: Remove the vsyscall=native documentationAndy Lutomirski1-6/+0
The vsyscall=native feature is gone -- remove the docs. Fixes: 076ca272a14c ("x86/vsyscall/64: Drop "native" vsyscalls") Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Kees Cook <keescook@chromium.org> Cc: Florian Weimer <fweimer@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: stable@vger.kernel.org Cc: Borislav Petkov <bp@alien8.de> Cc: Kernel Hardening <kernel-hardening@lists.openwall.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lkml.kernel.org/r/d77c7105eb4c57c1a95a95b6a5b8ba194a18e764.1561610354.git.luto@kernel.org
2019-06-28x86/tls: Fix possible spectre-v1 in do_get_thread_area()Dianzhang Chen1-2/+7
The index to access the threads tls array is controlled by userspace via syscall: sys_ptrace(), hence leading to a potential exploitation of the Spectre variant 1 vulnerability. The index can be controlled from: ptrace -> arch_ptrace -> do_get_thread_area. Fix this by sanitizing the user supplied index before using it to access the p->thread.tls_array. Signed-off-by: Dianzhang Chen <dianzhangchen0@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: bp@alien8.de Cc: hpa@zytor.com Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/1561524630-3642-1-git-send-email-dianzhangchen0@gmail.com
2019-06-28x86/ptrace: Fix possible spectre-v1 in ptrace_get_debugreg()Dianzhang Chen1-1/+4
The index to access the threads ptrace_bps is controlled by userspace via syscall: sys_ptrace(), hence leading to a potential exploitation of the Spectre variant 1 vulnerability. The index can be controlled from: ptrace -> arch_ptrace -> ptrace_get_debugreg. Fix this by sanitizing the user supplied index before using it access thread->ptrace_bps. Signed-off-by: Dianzhang Chen <dianzhangchen0@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: bp@alien8.de Cc: hpa@zytor.com Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/1561476617-3759-1-git-send-email-dianzhangchen0@gmail.com
2019-06-28hrtimer: Use a bullet for the returns bullet listMauro Carvalho Chehab1-3/+4
That gets rid of this warning: ./kernel/time/hrtimer.c:1119: WARNING: Block quote ends without a blank line; unexpected unindent. and displays nicely both at the source code and at the produced documentation. Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Linux Doc Mailing List <linux-doc@vger.kernel.org> Cc: Mauro Carvalho Chehab <mchehab@infradead.org> Cc: Jonathan Corbet <corbet@lwn.net> Link: https://lkml.kernel.org/r/74ddad7dac331b4e5ce4a90e15c8a49e3a16d2ac.1561372382.git.mchehab+samsung@kernel.org
2019-06-27ceph: fix ceph_mdsc_build_path to not stop on first componentJeff Layton1-1/+2
When ceph_mdsc_build_path is handed a positive dentry, it will return a zero-length path string with the base set to that dentry. This is not what we want. Always include at least one path component in the string. ceph_mdsc_build_path has behaved this way for a long time but it didn't matter until recent d_name handling rework. Fixes: 964fff7491e4 ("ceph: use ceph_mdsc_build_path instead of clone_dentry_name") Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2019-06-27perf: arm_spe: Enable ACPI/Platform automatic module loadingJeremy Linton1-2/+10
Lets add the MODULE_TABLE and platform id_table entries so that the SPE driver can attach to the ACPI platform device created by the core pmu code. Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-06-27arm_pmu: acpi: spe: Add initial MADT/SPE probingJeremy Linton3-0/+77
ACPI 6.3 adds additional fields to the MADT GICC structure to describe SPE PPI's. We pick these out of the cached reference to the madt_gicc structure similarly to the core PMU code. We then create a platform device referring to the IRQ and let the user/module loader decide whether to load the SPE driver. Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-06-27ACPI/PPTT: Add function to return ACPI 6.3 Identical tokensJeremy Linton2-0/+31
ACPI 6.3 adds a flag to indicate that child nodes are all identical cores. This is useful to authoritatively determine if a set of (possibly offline) cores are identical or not. Since the flag doesn't give us a unique id we can generate one and use it to create bitmaps of sibling nodes, or simply in a loop to determine if a subset of cores are identical. Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-06-27ACPI/PPTT: Modify node flag detection to find last IDENTICALJeremy Linton1-6/+29
The ACPI specification implies that the IDENTICAL flag should be set on all non leaf nodes where the children are identical. This means that we need to be searching for the last node with the identical flag set rather than the first one. Since this flag is also dependent on the table revision, we need to add a bit of extra code to verify the table revision, and the next node's state in the traversal. Since we want to avoid function pointers here, lets just special case the IDENTICAL flag. Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-06-27ARM: dts: armada-xp-98dx3236: Switch to armada-38x-uart serial nodeJoshua Scott1-0/+8
Switch to the "marvell,armada-38x-uart" driver variant to empty the UART buffer before writing to the UART_LCR register. Signed-off-by: Joshua Scott <joshua.scott@alliedtelesis.co.nz> Tested-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Gregory CLEMENT <gregory.clement@bootlin.com>. Cc: stable@vger.kernel.org Fixes: 43e28ba87708 ("ARM: dts: Use armada-370-xp as a base for armada-xp-98dx3236") Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
2019-06-27pinctrl: mediatek: Update cur_mask in mask/mask opsNicolas Boichat1-14/+4
During suspend/resume, mtk_eint_mask may be called while wake_mask is active. For example, this happens if a wake-source with an active interrupt handler wakes the system: irq/pm.c:irq_pm_check_wakeup would disable the interrupt, so that it can be handled later on in the resume flow. However, this may happen before mtk_eint_do_resume is called: in this case, wake_mask is loaded, and cur_mask is restored from an older copy, re-enabling the interrupt, and causing an interrupt storm (especially for level interrupts). Step by step, for a line that has both wake and interrupt enabled: 1. cur_mask[irq] = 1; wake_mask[irq] = 1; EINT_EN[irq] = 1 (interrupt enabled at hardware level) 2. System suspends, resumes due to that line (at this stage EINT_EN == wake_mask) 3. irq_pm_check_wakeup is called, and disables the interrupt => EINT_EN[irq] = 0, but we still have cur_mask[irq] = 1 4. mtk_eint_do_resume is called, and restores EINT_EN = cur_mask, so it reenables EINT_EN[irq] = 1 => interrupt storm as the driver is not yet ready to handle the interrupt. This patch fixes the issue in step 3, by recording all mask/unmask changes in cur_mask. This also avoids the need to read the current mask in eint_do_suspend, and we can remove mtk_eint_chip_read_mask function. The interrupt will be re-enabled properly later on, sometimes after mtk_eint_do_resume, when the driver is ready to handle it. Fixes: 58a5e1b64bb0 ("pinctrl: mediatek: Implement wake handler and suspend resume") Signed-off-by: Nicolas Boichat <drinkcat@chromium.org> Acked-by: Sean Wang <sean.wang@kernel.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2019-06-27proc: remove useless d_is_dir() checkChristian Brauner1-2/+1
Remove the d_is_dir() check from tgid_pidfd_to_pid(). It is pointless since you should never get &proc_tgid_base_operations for f_op on a non-directory. Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Christian Brauner <christian@brauner.io>
2019-06-27copy_process(): don't use ksys_close() on cleanupsAl Viro1-28/+18
anon_inode_getfd() should be used *ONLY* in situations when we are guaranteed to be past the last failure point (including copying the descriptor number to userland, at that). And ksys_close() should not be used for cleanups at all. anon_inode_getfile() is there for all nontrivial cases like that. Just use that... Fixes: b3e583825266 ("clone: add CLONE_PIDFD") Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Reviewed-by: Jann Horn <jannh@google.com> Signed-off-by: Christian Brauner <christian@brauner.io>
2019-06-27x86/entry: Simplify _TIF_SYSCALL_EMU handlingSudeep Holla1-11/+6
The usage of emulated and _TIF_SYSCALL_EMU flags in syscall_trace_enter is more complicated than required. Cc: Andy Lutomirski <luto@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Acked-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-06-27cpu/hotplug: Fix out-of-bounds read when setting fail stateEiichi Tsukata1-0/+3
Setting invalid value to /sys/devices/system/cpu/cpuX/hotplug/fail can control `struct cpuhp_step *sp` address, results in the following global-out-of-bounds read. Reproducer: # echo -2 > /sys/devices/system/cpu/cpu0/hotplug/fail KASAN report: BUG: KASAN: global-out-of-bounds in write_cpuhp_fail+0x2cd/0x2e0 Read of size 8 at addr ffffffff89734438 by task bash/1941 CPU: 0 PID: 1941 Comm: bash Not tainted 5.2.0-rc6+ #31 Call Trace: write_cpuhp_fail+0x2cd/0x2e0 dev_attr_store+0x58/0x80 sysfs_kf_write+0x13d/0x1a0 kernfs_fop_write+0x2bc/0x460 vfs_write+0x1e1/0x560 ksys_write+0x126/0x250 do_syscall_64+0xc1/0x390 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7f05e4f4c970 The buggy address belongs to the variable: cpu_hotplug_lock+0x98/0xa0 Memory state around the buggy address: ffffffff89734300: fa fa fa fa 00 00 00 00 00 00 00 00 00 00 00 00 ffffffff89734380: fa fa fa fa 00 00 00 00 00 00 00 00 00 00 00 00 >ffffffff89734400: 00 00 00 00 fa fa fa fa 00 00 00 00 fa fa fa fa ^ ffffffff89734480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffffffff89734500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Add a sanity check for the value written from user space. Fixes: 1db49484f21ed ("smp/hotplug: Hotplug state fail injection") Signed-off-by: Eiichi Tsukata <devel@etsukata.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: peterz@infradead.org Link: https://lkml.kernel.org/r/20190627024732.31672-1-devel@etsukata.com
2019-06-27af_packet: Block execution of tasks waiting for transmit to complete in ↵Neil Horman2-3/+18
AF_PACKET When an application is run that: a) Sets its scheduler to be SCHED_FIFO and b) Opens a memory mapped AF_PACKET socket, and sends frames with the MSG_DONTWAIT flag cleared, its possible for the application to hang forever in the kernel. This occurs because when waiting, the code in tpacket_snd calls schedule, which under normal circumstances allows other tasks to run, including ksoftirqd, which in some cases is responsible for freeing the transmitted skb (which in AF_PACKET calls a destructor that flips the status bit of the transmitted frame back to available, allowing the transmitting task to complete). However, when the calling application is SCHED_FIFO, its priority is such that the schedule call immediately places the task back on the cpu, preventing ksoftirqd from freeing the skb, which in turn prevents the transmitting task from detecting that the transmission is complete. We can fix this by converting the schedule call to a completion mechanism. By using a completion queue, we force the calling task, when it detects there are no more frames to send, to schedule itself off the cpu until such time as the last transmitted skb is freed, allowing forward progress to be made. Tested by myself and the reporter, with good results Change Notes: V1->V2: Enhance the sleep logic to support being interruptible and allowing for honoring to SK_SNDTIMEO (Willem de Bruijn) V2->V3: Rearrage the point at which we wait for the completion queue, to avoid needing to check for ph/skb being null at the end of the loop. Also move the complete call to the skb destructor to avoid needing to modify __packet_set_status. Also gate calling complete on packet_read_pending returning zero to avoid multiple calls to complete. (Willem de Bruijn) Move timeo computation within loop, to re-fetch the socket timeout since we also use the timeo variable to record the return code from the wait_for_complete call (Neil Horman) V3->V4: Willem has requested that the control flow be restored to the previous state. Doing so lets us eliminate the need for the po->wait_on_complete flag variable, and lets us get rid of the packet_next_frame function, but introduces another complexity. Specifically, but using the packet pending count, we can, if an applications calls sendmsg multiple times with MSG_DONTWAIT set, each set of transmitted frames, when complete, will cause tpacket_destruct_skb to issue a complete call, for which there will never be a wait_on_completion call. This imbalance will lead to any future call to wait_for_completion here to return early, when the frames they sent may not have completed. To correct this, we need to re-init the completion queue on every call to tpacket_snd before we enter the loop so as to ensure we wait properly for the frames we send in this iteration. Change the timeout and interrupted gotos to out_put rather than out_status so that we don't try to free a non-existant skb Clean up some extra newlines (Willem de Bruijn) Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Reported-by: Matteo Croce <mcroce@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>