summaryrefslogtreecommitdiff
path: root/include/linux/jump_label.h
AgeCommit message (Collapse)AuthorFilesLines
2019-06-17jump_label: Batch updates if arch supports itDaniel Bristot de Oliveira1-0/+3
If the architecture supports the batching of jump label updates, use it! An easy way to see the benefits of this patch is switching the schedstats on and off. For instance: -------------------------- %< ---------------------------- #!/bin/sh while [ true ]; do sysctl -w kernel.sched_schedstats=1 sleep 2 sysctl -w kernel.sched_schedstats=0 sleep 2 done -------------------------- >% ---------------------------- while watching the IPI count: -------------------------- %< ---------------------------- # watch -n1 "cat /proc/interrupts | grep Function" -------------------------- >% ---------------------------- With the current mode, it is possible to see +- 168 IPIs each 2 seconds, while with this patch the number of IPIs goes to 3 each 2 seconds. Regarding the performance impact of this patch set, I made two measurements: The time to update a key (the task that is causing the change) The time to run the int3 handler (the side effect on a thread that hits the code being changed) The schedstats static key was chosen as the key to being switched on and off. The reason being is that it is used in more than 56 places, in a hot path. The change in the schedstats static key will be done with the following command: while [ true ]; do sysctl -w kernel.sched_schedstats=1 usleep 500000 sysctl -w kernel.sched_schedstats=0 usleep 500000 done In this way, they key will be updated twice per second. To force the hit of the int3 handler, the system will also run a kernel compilation with two jobs per CPU. The test machine is a two nodes/24 CPUs box with an Intel Xeon processor @2.27GHz. Regarding the update part, on average, the regular kernel takes 57 ms to update the schedstats key, while the kernel with the batch updates takes just 1.4 ms on average. Although it seems to be too good to be true, it makes sense: the schedstats key is used in 56 places, so it was expected that it would take around 56 times to update the keys with the current implementation, as the IPIs are the most expensive part of the update. Regarding the int3 handler, the non-batch handler takes 45 ns on average, while the batch version takes around 180 ns. At first glance, it seems to be a high value. But it is not, considering that it is doing 56 updates, rather than one! It is taking four times more, only. This gain is possible because the patch uses a binary search in the vector: log2(56)=5.8. So, it was expected to have an overhead within four times. (voice of tv propaganda) But, that is not all! As the int3 handler keeps on for a shorter period (because the update part is on for a shorter time), the number of hits in the int3 handler decreased by 10%. The question then is: Is it worth paying the price of "135 ns" more in the int3 handler? Considering that, in this test case, we are saving the handling of 53 IPIs, that takes more than these 135 ns, it seems to be a meager price to be paid. Moreover, the test case was forcing the hit of the int3, in practice, it does not take that often. While the IPI takes place on all CPUs, hitting the int3 handler or not! For instance, in an isolated CPU with a process running in user-space (nohz_full use-case), the chances of hitting the int3 handler is barely zero, while there is no way to avoid the IPIs. By bounding the IPIs, we are improving a lot this scenario. Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Chris von Recklinghausen <crecklin@redhat.com> Cc: Clark Williams <williams@redhat.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jason Baron <jbaron@akamai.com> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Scott Wood <swood@redhat.com> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/acc891dbc2dbc9fd616dd680529a2337b1d1274c.1560325897.git.bristot@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-01-06jump_label: move 'asm goto' support test to KconfigMasahiro Yamada1-13/+9
Currently, CONFIG_JUMP_LABEL just means "I _want_ to use jump label". The jump label is controlled by HAVE_JUMP_LABEL, which is defined like this: #if defined(CC_HAVE_ASM_GOTO) && defined(CONFIG_JUMP_LABEL) # define HAVE_JUMP_LABEL #endif We can improve this by testing 'asm goto' support in Kconfig, then make JUMP_LABEL depend on CC_HAS_ASM_GOTO. Ugly #ifdef HAVE_JUMP_LABEL will go away, and CONFIG_JUMP_LABEL will match to the real kernel capability. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
2018-09-27jump_label: Annotate entries that operate on __init code earlierArd Biesheuvel1-7/+4
Jump table entries are mostly read-only, with the exception of the init and module loader code that defuses entries that point into init code when the code being referred to is freed. For robustness, it would be better to move these entries into the ro_after_init section, but clearing the 'code' member of each jump table entry referring to init code at module load time races with the module_enable_ro() call that remaps the ro_after_init section read only, so we'd like to do it earlier. So given that whether such an entry refers to init code can be decided much earlier, we can pull this check forward. Since we may still need the code entry at this point, let's switch to setting a low bit in the 'key' member just like we do to annotate the default state of a jump table entry. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-s390@vger.kernel.org Cc: Arnd Bergmann <arnd@arndb.de> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Jessica Yu <jeyu@kernel.org> Link: https://lkml.kernel.org/r/20180919065144.25010-8-ard.biesheuvel@linaro.org
2018-09-27jump_label: Implement generic support for relative referencesArd Biesheuvel1-0/+28
To reduce the size taken up by absolute references in jump label entries themselves and the associated relocation records in the .init segment, add support for emitting them as relative references instead. Note that this requires some extra care in the sorting routine, given that the offsets change when entries are moved around in the jump_entry table. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-s390@vger.kernel.org Cc: Arnd Bergmann <arnd@arndb.de> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Kees Cook <keescook@chromium.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Jessica Yu <jeyu@kernel.org> Link: https://lkml.kernel.org/r/20180919065144.25010-3-ard.biesheuvel@linaro.org
2018-09-27jump_label: Abstract jump_entry member accessorsArd Biesheuvel1-0/+34
In preparation of allowing architectures to use relative references in jump_label entries [which can dramatically reduce the memory footprint], introduce abstractions for references to the 'code' and 'key' members of struct jump_entry. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-s390@vger.kernel.org Cc: Arnd Bergmann <arnd@arndb.de> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Kees Cook <keescook@chromium.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Jessica Yu <jeyu@kernel.org> Link: https://lkml.kernel.org/r/20180919065144.25010-2-ard.biesheuvel@linaro.org
2018-07-04usercopy: Allow boot cmdline disabling of hardeningChris von Recklinghausen1-0/+6
Enabling HARDENED_USERCOPY may cause measurable regressions in networking performance: up to 8% under UDP flood. I ran a small packet UDP flood using pktgen vs. a host b2b connected. On the receiver side the UDP packets are processed by a simple user space process that just reads and drops them: https://github.com/netoptimizer/network-testing/blob/master/src/udp_sink.c Not very useful from a functional PoV, but it helps to pin-point bottlenecks in the networking stack. When running a kernel with CONFIG_HARDENED_USERCOPY=y, I see a 5-8% regression in the receive tput, compared to the same kernel without this option enabled. With CONFIG_HARDENED_USERCOPY=y, perf shows ~6% of CPU time spent cumulatively in __check_object_size (~4%) and __virt_addr_valid (~2%). The call-chain is: __GI___libc_recvfrom entry_SYSCALL_64_after_hwframe do_syscall_64 __x64_sys_recvfrom __sys_recvfrom inet_recvmsg udp_recvmsg __check_object_size udp_recvmsg() actually calls copy_to_iter() (inlined) and the latters calls check_copy_size() (again, inlined). A generic distro may want to enable HARDENED_USERCOPY in their default kernel config, but at the same time, such distro may want to be able to avoid the performance penalties in with the default configuration and disable the stricter check on a per-boot basis. This change adds a boot parameter that conditionally disables HARDENED_USERCOPY via "hardened_usercopy=off". Signed-off-by: Chris von Recklinghausen <crecklin@redhat.com> Signed-off-by: Kees Cook <keescook@chromium.org>
2018-03-20jump_label: Disable jump labels in __exit codeJosh Poimboeuf1-2/+2
With the following commit: 333522447063 ("jump_label: Explicitly disable jump labels in __init code") ... we explicitly disabled jump labels in __init code, so they could be detected and not warned about in the following commit: dc1dd184c2f0 ("jump_label: Warn on failed jump_label patching attempt") In-kernel __exit code has the same issue. It's never used, so it's freed along with the rest of initmem. But jump label entries in __exit code aren't explicitly disabled, so we get the following warning when enabling pr_debug() in __exit code: can't patch jump_label at dmi_sysfs_exit+0x0/0x2d WARNING: CPU: 0 PID: 22572 at kernel/jump_label.c:376 __jump_label_update+0x9d/0xb0 Fix the warning by disabling all jump labels in initmem (which includes both __init and __exit code). Reported-and-tested-by: Li Wang <liwang@redhat.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Borislav Petkov <bp@suse.de> Cc: Jason Baron <jbaron@akamai.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: dc1dd184c2f0 ("jump_label: Warn on failed jump_label patching attempt") Link: http://lkml.kernel.org/r/7121e6e595374f06616c505b6e690e275c0054d1.1521483452.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-21jump_label: Explicitly disable jump labels in __init codeJosh Poimboeuf1-0/+3
After initmem has been freed, any jump labels in __init code are prevented from being written to by the kernel_text_address() check in __jump_label_update(). However, this check is quite broad. If kernel_text_address() were to return false for any other reason, the jump label write would fail silently with no warning. For jump labels in module init code, entry->code is set to zero to indicate that the entry is disabled. Do the same thing for core kernel init code. This makes the behavior more consistent, and will also make it more straightforward to detect non-init jump label write failures in the next patch. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@suse.de> Cc: Jason Baron <jbaron@akamai.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/c52825c73f3a174e8398b6898284ec20d4deb126.1519051220.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-06jump_label: Add branch hints to static_branch_{un,}likely()Peter Zijlstra1-2/+2
For some reason these were missing, I've not observed this patch making a difference in the few code locations I checked, but this makes sense. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Jason Baron <jbaron@akamai.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-24sched/core: Fix cpu.max vs. cpuhotplug deadlockPeter Zijlstra1-0/+7
Tejun reported the following cpu-hotplug lock (percpu-rwsem) read recursion: tg_set_cfs_bandwidth() get_online_cpus() cpus_read_lock() cfs_bandwidth_usage_inc() static_key_slow_inc() cpus_read_lock() Reported-by: Tejun Heo <tj@kernel.org> Tested-by: Tejun Heo <tj@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20180122215328.GP3397@worktop Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-07Merge branch 'linus' into locking/core, to resolve conflictsIngo Molnar1-0/+1
Conflicts: include/linux/compiler-clang.h include/linux/compiler-gcc.h include/linux/compiler-intel.h include/uapi/linux/stddef.h Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-02License cleanup: add SPDX GPL-2.0 license identifier to files with no licenseGreg Kroah-Hartman1-0/+1
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-10-19locking/static_keys: Improve uninitialized key warningBorislav Petkov1-7/+7
Right now it says: static_key_disable_cpuslocked used before call to jump_label_init ------------[ cut here ]------------ WARNING: CPU: 0 PID: 0 at kernel/jump_label.c:161 static_key_disable_cpuslocked+0x68/0x70 Modules linked in: CPU: 0 PID: 0 Comm: swapper Not tainted 4.14.0-rc5+ #1 Hardware name: SGI.COM C2112-4GP3/X10DRT-P-Series, BIOS 2.0a 05/09/2016 task: ffffffff81c0e480 task.stack: ffffffff81c00000 RIP: 0010:static_key_disable_cpuslocked+0x68/0x70 RSP: 0000:ffffffff81c03ef0 EFLAGS: 00010096 ORIG_RAX: 0000000000000000 RAX: 0000000000000041 RBX: ffffffff81c32680 RCX: ffffffff81c5cbf8 RDX: 0000000000000001 RSI: 0000000000000092 RDI: 0000000000000002 RBP: ffff88807fffd240 R08: 726f666562206465 R09: 0000000000000136 R10: 0000000000000000 R11: 696e695f6c656261 R12: ffffffff82158900 R13: ffffffff8215f760 R14: 0000000000000001 R15: 0000000000000008 FS: 0000000000000000(0000) GS:ffff883f7f400000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffff88807ffff000 CR3: 0000000001c09000 CR4: 00000000000606b0 Call Trace: static_key_disable+0x16/0x20 start_kernel+0x15a/0x45d ? load_ucode_intel_bsp+0x11/0x2d secondary_startup_64+0xa5/0xb0 Code: 48 c7 c7 a0 15 cf 81 e9 47 53 4b 00 48 89 df e8 5f fc ff ff eb e8 48 c7 c6 \ c0 97 83 81 48 c7 c7 d0 ff a2 81 31 c0 e8 c5 9d f5 ff <0f> ff eb a7 0f ff eb \ b0 e8 eb a2 4b 00 53 48 89 fb e8 42 0e f0 but it doesn't tell me which key it is. So dump the key's name too: static_key_disable_cpuslocked(): static key 'virt_spin_lock_key' used before call to jump_label_init() And that makes pinpointing which key is causing that a lot easier. include/linux/jump_label.h | 14 +++++++------- include/linux/jump_label_ratelimit.h | 6 +++--- kernel/jump_label.c | 14 +++++++------- 3 files changed, 17 insertions(+), 17 deletions(-) Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Cc: Jason Baron <jbaron@akamai.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20171018152428.ffjgak4o25f7ept6@pd.tnic Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-10jump_label: Provide hotplug context variantsMarc Zyngier1-2/+9
As using the normal static key API under the hotplug lock is pretty much impossible, let's provide a variant of some of them that require the hotplug lock to have already been taken. These function are only meant to be used in CPU hotplug callbacks. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20170801080257.5056-4-marc.zyngier@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-10jump_label: Fix concurrent static_key_enable/disable()Paolo Bonzini1-10/+12
static_key_enable/disable are trying to cap the static key count to 0/1. However, their use of key->enabled is outside jump_label_lock so they do not really ensure that. Rewrite them to do a quick check for an already enabled (respectively, already disabled), and then recheck under the jump label lock. Unlike static_key_slow_inc/dec, a failed check under the jump label lock does not modify key->enabled. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Jason Baron <jbaron@akamai.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1501601046-35683-2-git-send-email-pbonzini@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-07Merge tag 'trace-v4.11-rc1' of ↵Linus Torvalds1-2/+9
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: "There was some breakage with the changes for jump labels in the 4.11 merge window: - powerpc broke as jump labels uses the two LSB bits as flags in initialization. A check was added to make sure that all jump label entries were 4 bytes aligned, but powerpc didn't work that way for modules. Adding an alignment in the module linker script appeared to be the best solution. - Jump labels also added an anonymous union to access those LSB bits as a normal long. But because this structure had static initialization, it broke older compilers that could not statically initialize anonymous unions without brackets. - The command line parameter for setting function graph filter broke the "EMPTY_HASH" descriptor by modifying it instead of creating a new hash to hold the entries. - The command line parameter ftrace_graph_max_depth was added to allow its setting at boot time. It uses existing code and only the command line hook was added. This is not really a fix, but as it uses existing code without affecting anything else, I added it to this release. It was ready before the merge window closed, but I wanted to let it sit in linux-next for a couple of days first" * tag 'trace-v4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: ftrace/graph: Add ftrace_graph_max_depth kernel parameter tracing: Add #undef to fix compile error jump_label: Add comment about initialization order for anonymous unions jump_label: Fix anonymous union initialization module: set __jump_table alignment to 8 ftrace/graph: Do not modify the EMPTY_HASH for the function_graph filter tracing: Fix code comment for ftrace_ops_get_func()
2017-03-03jump_label: Add comment about initialization order for anonymous unionsSteven Rostedt (VMware)1-0/+7
Commit 3821fd35b58d ("jump_label: Reduce the size of struct static_key") broke old compilers that could not handle static initialization of anonymous unions. Boris fixed it with a patch that added brackets around the static initializer. But this creates a dependency between those initializers and the structure's order of its fields. Document this dependency in case new fields are added to struct static_key in the future. Noted-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Suggested-by: Chris Mason <clm@fb.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-03-03jump_label: Fix anonymous union initializationBoris Ostrovsky1-2/+2
Pre-4.6 gcc do not allow direct static initialization of members of anonymous structs/unions. After commit 3821fd35b58d ("jump_label: Reduce the size of struct static_key") STATIC_KEY_INIT_{TRUE|FALSE} definitions cannot be compiled with those older compilers. Placing initializers inside curved brackets works around this problem. Link: http://lkml.kernel.org/r/1488299542-30765-1-git-send-email-boris.ostrovsky@oracle.com Fixes: 3821fd35b58d ("jump_label: Reduce the size of struct static_key") Reviewed-by: Jason Baron <jbaron@akamai.com> Compiled-by: Chris Mason <clm@fb.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-02-28Merge tag 'trace-v4.11' of ↵Linus Torvalds1-8/+15
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: "This release has no new tracing features, just clean ups, minor fixes and small optimizations" * tag 'trace-v4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (25 commits) tracing: Remove outdated ring buffer comment tracing/probes: Fix a warning message to show correct maximum length tracing: Fix return value check in trace_benchmark_reg() tracing: Use modern function declaration jump_label: Reduce the size of struct static_key tracing/probe: Show subsystem name in messages tracing/hwlat: Update old comment about migration timers: Make flags output in the timer_start tracepoint useful tracing: Have traceprobe_probes_write() not access userspace unnecessarily tracing: Have COMM event filter key be treated as a string ftrace: Have set_graph_function handle multiple functions in one write ftrace: Do not hold references of ftrace_graph_{notrace_}hash out of graph_lock tracing: Reset parser->buffer to allow multiple "puts" ftrace: Have set_graph_functions handle write with RDWR ftrace: Reset fgd->hash in ftrace_graph_write() ftrace: Replace (void *)1 with a meaningful macro name FTRACE_GRAPH_EMPTY ftrace: Create a slight optimization on searching the ftrace_hash tracing: Add ftrace_hash_key() helper function ftrace: Convert graph filter to use hash tables ftrace: Expose ftrace_hash_empty and ftrace_lookup_ip ...
2017-02-15jump_label: Reduce the size of struct static_keyJason Baron1-8/+15
The static_key->next field goes mostly unused. The field is used for associating module uses with a static key. Most uses of struct static_key define a static key in the core kernel and make use of it entirely within the core kernel, or define the static key in a module and make use of it only from within that module. In fact, of the ~3,000 static keys defined, I found only about 5 or so that did not fit this pattern. Thus, we can remove the static_key->next field entirely and overload the static_key->entries field. That is, when all the static_key uses are contained within the same module, static_key->entries continues to point to those uses. However, if the static_key uses are not contained within the module where the static_key is defined, then we allocate a struct static_key_mod, store a pointer to the uses within that struct static_key_mod, and have the static key point at the static_key_mod. This does incur some extra memory usage when a static_key is used in a module that does not define it, but since there are only a handful of such cases there is a net savings. In order to identify if the static_key->entries pointer contains a struct static_key_mod or a struct jump_entry pointer, bit 1 of static_key->entries is set to 1 if it points to a struct static_key_mod and is 0 if it points to a struct jump_entry. We were already using bit 0 in a similar way to store the initial value of the static_key. This does mean that allocations of struct static_key_mod and that the struct jump_entry tables need to be at least 4-byte aligned in memory. As far as I can tell all arches meet this criteria. For my .config, the patch increased the text by 778 bytes, but reduced the data + bss size by 14912, for a net savings of 14,134 bytes. text data bss dec hex filename 8092427 5016512 790528 13899467 d416cb vmlinux.pre 8093205 5001600 790528 13885333 d3df95 vmlinux.post Link: http://lkml.kernel.org/r/1486154544-4321-1-git-send-email-jbaron@akamai.com Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Joe Perches <joe@perches.com> Signed-off-by: Jason Baron <jbaron@akamai.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-01-19jump_labels: Move header guard #endif down where it belongsLuis R. Rodriguez1-2/+2
The ending header guard is misplaced. This has no functional change, this is just an eye-sore. Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: bp@alien8.de Cc: bp@suse.de Cc: catalin.marinas@arm.com Cc: jbaron@akamai.com Cc: pbonzini@redhat.com Cc: tony.luck@intel.com Link: http://lkml.kernel.org/r/20170118173804.16281-1-mcgrof@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-10-03Merge branch 'ras-core-for-linus' of ↵Linus Torvalds1-0/+6
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RAS updates from Ingo Molnar: "The main changes were: - Lots of enhancements for AMD SMCA (Scalable MCA features/extensions) systems: extract, decode and print more hardware error information and add matching support on the injection/testing side as well. (Yazn Ghannam) - Various MCE handling improvements on modern Intel Xeons. (Tony Luck) - Plus misc fixes and enhancements" * 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits) x86/RAS/mce_amd_inj: Remove debugfs dir recursively on exit x86/RAS/mce_amd_inj: Fix signed wrap around when decrementing index 'i' x86/RAS/mce_amd_inj: Fix some W= warnings x86/MCE/AMD, EDAC: Handle reserved bank 4 on Fam17h properly x86/mce/AMD: Extract the error address on SMCA systems x86/mce, EDAC/mce_amd: Print MCA_SYND and MCA_IPID during MCE on SMCA systems x86/mce/AMD: Save MCA_IPID in MCE struct on SMCA systems x86/mce/AMD: Ensure the deferred error interrupt is of type APIC on SMCA systems x86/mce/AMD: Update sysfs bank names for SMCA systems x86/mce/AMD, EDAC/mce_amd: Define and use tables for known SMCA IP types EDAC/mce_amd: Use SMCA prefix for error descriptions arrays EDAC/mce_amd: Add missing SMCA error descriptions x86/mce/AMD: Read MSRs on the CPU allocating the threshold blocks x86/RAS: Add syndrome support to mce_amd_inj EDAC/mce_amd: Print syndrome register value on SMCA systems x86/mce: Add support for new MCA_SYND register x86/mce/AMD: Use msr_ops.misc() in allocate_threshold_blocks() x86/mce: Drop X86_FEATURE_MCE_RECOVERY and the related model string test x86/mce: Improve memcpy_mcsafe() x86/mce: Add PCI quirks to identify Xeons with machine check recovery ...
2016-09-07jump_labels: Allow array initialisersCatalin Marinas1-0/+12
The static key API is currently designed around single variable definitions. There are cases where an array of static keys is desirable, so extend the API to allow this rather than using the internal static key implementation directly. Cc: Jason Baron <jbaron@akamai.com> Cc: Jonathan Corbet <corbet@lwn.net> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Suggested-by: Dave P Martin <Dave.Martin@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-05locking/static_keys: Provide DECLARE and well as DEFINE macrosTony Luck1-0/+6
We will need to provide declarations of static keys in header files. Provide DECLARE_STATIC_KEY_{TRUE,FALSE} macros. Signed-off-by: Tony Luck <tony.luck@intel.com> Acked-by: Borislav Petkov <bp@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/816881cf85bd3cf13385d212882618f38a3b5d33.1472754711.git.tony.luck@intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-08-04jump_label: remove bug.h, atomic.h dependencies for HAVE_JUMP_LABELJason Baron1-23/+23
The current jump_label.h includes bug.h for things such as WARN_ON(). This makes the header problematic for inclusion by kernel.h or any headers that kernel.h includes, since bug.h includes kernel.h (circular dependency). The inclusion of atomic.h is similarly problematic. Thus, this should make jump_label.h 'includable' from most places. Link: http://lkml.kernel.org/r/7060ce35ddd0d20b33bf170685e6b0fab816bdf2.1467837322.git.jbaron@akamai.com Signed-off-by: Jason Baron <jbaron@akamai.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Chris Metcalf <cmetcalf@mellanox.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Joe Perches <joe@perches.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-06-24locking/static_key: Fix concurrent static_key_slow_inc()Paolo Bonzini1-3/+13
The following scenario is possible: CPU 1 CPU 2 static_key_slow_inc() atomic_inc_not_zero() -> key.enabled == 0, no increment jump_label_lock() atomic_inc_return() -> key.enabled == 1 now static_key_slow_inc() atomic_inc_not_zero() -> key.enabled == 1, inc to 2 return ** static key is wrong! jump_label_update() jump_label_unlock() Testing the static key at the point marked by (**) will follow the wrong path for jumps that have not been patched yet. This can actually happen when creating many KVM virtual machines with userspace LAPIC emulation; just run several copies of the following program: #include <fcntl.h> #include <unistd.h> #include <sys/ioctl.h> #include <linux/kvm.h> int main(void) { for (;;) { int kvmfd = open("/dev/kvm", O_RDONLY); int vmfd = ioctl(kvmfd, KVM_CREATE_VM, 0); close(ioctl(vmfd, KVM_CREATE_VCPU, 1)); close(vmfd); close(kvmfd); } return 0; } Every KVM_CREATE_VCPU ioctl will attempt a static_key_slow_inc() call. The static key's purpose is to skip NULL pointer checks and indeed one of the processes eventually dereferences NULL. As explained in the commit that introduced the bug: 706249c222f6 ("locking/static_keys: Rework update logic") jump_label_update() needs key.enabled to be true. The solution adopted here is to temporarily make key.enabled == -1, and use go down the slow path when key.enabled <= 0. Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <stable@vger.kernel.org> # v4.3+ Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 706249c222f6 ("locking/static_keys: Rework update logic") Link: http://lkml.kernel.org/r/1466527937-69798-1-git-send-email-pbonzini@redhat.com [ Small stylistic edits to the changelog and the code. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-11-23treewide: Remove old email addressPeter Zijlstra1-1/+1
There were still a number of references to my old Red Hat email address in the kernel source. Remove these while keeping the Red Hat copyright notices intact. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-11-06Merge branch 'for-4.4' of ↵Linus Torvalds1-7/+11
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup updates from Tejun Heo: "The cgroup core saw several significant updates this cycle: - percpu_rwsem for threadgroup locking is reinstated. This was temporarily dropped due to down_write latency issues. Oleg's rework of percpu_rwsem which is scheduled to be merged in this merge window resolves the issue. - On the v2 hierarchy, when controllers are enabled and disabled, all operations are atomic and can fail and revert cleanly. This allows ->can_attach() failure which is necessary for cpu RT slices. - Tasks now stay associated with the original cgroups after exit until released. This allows tracking resources held by zombies (e.g. pids) and makes it easy to find out where zombies came from on the v2 hierarchy. The pids controller was broken before these changes as zombies escaped the limits; unfortunately, updating this behavior required too many invasive changes and I don't think it's a good idea to backport them, so the pids controller on 4.3, the first version which included the pids controller, will stay broken at least until I'm sure about the cgroup core changes. - Optimization of a couple common tests using static_key" * 'for-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (38 commits) cgroup: fix race condition around termination check in css_task_iter_next() blkcg: don't create "io.stat" on the root cgroup cgroup: drop cgroup__DEVEL__legacy_files_on_dfl cgroup: replace error handling in cgroup_init() with WARN_ON()s cgroup: add cgroup_subsys->free() method and use it to fix pids controller cgroup: keep zombies associated with their original cgroups cgroup: make css_set_rwsem a spinlock and rename it to css_set_lock cgroup: don't hold css_set_rwsem across css task iteration cgroup: reorganize css_task_iter functions cgroup: factor out css_set_move_task() cgroup: keep css_set and task lists in chronological order cgroup: make cgroup_destroy_locked() test cgroup_is_populated() cgroup: make css_sets pin the associated cgroups cgroup: relocate cgroup_[try]get/put() cgroup: move check_for_release() invocation cgroup: replace cgroup_has_tasks() with cgroup_is_populated() cgroup: make cgroup->nr_populated count the number of populated css_sets cgroup: remove an unused parameter from cgroup_task_migrate() cgroup: fix too early usage of static_branch_disable() cgroup: make cgroup_update_dfl_csses() migrate all target processes atomically ...
2015-09-18jump_label: make static_key_enabled() work on static_key_true/false types tooTejun Heo1-7/+11
static_key_enabled() can be used on struct static_key but not on its wrapper types static_key_true and static_key_false. The function is useful for debugging and management of static keys. Update it so that it can be used for the wrapper types too. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org>
2015-09-15locking/static_keys: Fix up the static keys documentationJonathan Corbet1-6/+4
Fix a few small mistakes in the static key documentation and delete an unneeded sentence. Suggested-by: Jason Baron <jbaron@akamai.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20150914171105.511e1e21@lwn.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-09-08locking/static_keys: Fix a silly typoJonathan Corbet1-1/+1
Commit: 412758cb2670 ("jump label, locking/static_keys: Update docs") introduced a typo that might as well get fixed. Signed-off-by: Jonathan Corbet <corbet@lwn.net> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jason Baron <jbaron@akamai.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20150907131803.54c027e1@lwn.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-03jump label, locking/static_keys: Update docsJason Baron1-21/+46
Signed-off-by: Jason Baron <jbaron@akamai.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: benh@kernel.crashing.org Cc: bp@alien8.de Cc: davem@davemloft.net Cc: ddaney@caviumnetworks.com Cc: heiko.carstens@de.ibm.com Cc: linux-kernel@vger.kernel.org Cc: liuj97@gmail.com Cc: luto@amacapital.net Cc: michael@ellerman.id.au Cc: rabin@rab.in Cc: ralf@linux-mips.org Cc: rostedt@goodmis.org Cc: vbabka@suse.cz Cc: will.deacon@arm.com Link: http://lkml.kernel.org/r/6b50f2f6423a2244f37f4b1d2d6c211b9dcdf4f8.1438227999.git.jbaron@akamai.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-03locking/static_keys: Add a new static_key interfacePeter Zijlstra1-10/+139
There are various problems and short-comings with the current static_key interface: - static_key_{true,false}() read like a branch depending on the key value, instead of the actual likely/unlikely branch depending on init value. - static_key_{true,false}() are, as stated above, tied to the static_key init values STATIC_KEY_INIT_{TRUE,FALSE}. - we're limited to the 2 (out of 4) possible options that compile to a default NOP because that's what our arch_static_branch() assembly emits. So provide a new static_key interface: DEFINE_STATIC_KEY_TRUE(name); DEFINE_STATIC_KEY_FALSE(name); Which define a key of different types with an initial true/false value. Then allow: static_branch_likely() static_branch_unlikely() to take a key of either type and emit the right instruction for the case. This means adding a second arch_static_branch_jump() assembly helper which emits a JMP per default. In order to determine the right instruction for the right state, encode the branch type in the LSB of jump_entry::key. This is the final step in removing the naming confusion that has led to a stream of avoidable bugs such as: a833581e372a ("x86, perf: Fix static_key bug in load_mm_cr4()") ... but it also allows new static key combinations that will give us performance enhancements in the subsequent patches. Tested-by: Rabin Vincent <rabin@rab.in> # arm Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> # ppc Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> # s390 Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-03locking/static_keys: Add static_key_{en,dis}able() helpersPeter Zijlstra1-0/+20
Add two helpers to make it easier to treat the refcount as boolean. Suggested-by: Jason Baron <jasonbaron0@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-03jump_label, locking/static_keys: Rename JUMP_LABEL_TYPE_* and related ↵Peter Zijlstra1-20/+5
helpers to the static_key* pattern Rename the JUMP_LABEL_TYPE_* macros to be JUMP_TYPE_* and move the inline helpers into kernel/jump_label.c, since that's the only place they're ever used. Also rename the helpers where it's all about static keys. This is the second step in removing the naming confusion that has led to a stream of avoidable bugs such as: a833581e372a ("x86, perf: Fix static_key bug in load_mm_cr4()") Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-03jump_label: Rename JUMP_LABEL_{EN,DIS}ABLE to JUMP_LABEL_{JMP,NOP}Peter Zijlstra1-2/+2
Since we've already stepped away from ENABLE is a JMP and DISABLE is a NOP with the branch_default bits, and are going to make it even worse, rename it to make it all clearer. This way we don't mix multiple levels of logic attributes, but have a plain 'physical' name for what the current instruction patching status of a jump label is. This is a first step in removing the naming confusion that has led to a stream of avoidable bugs such as: a833581e372a ("x86, perf: Fix static_key bug in load_mm_cr4()") Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org [ Beefed up the changelog. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-09jump_label: Allow jump labels to be used in assemblyAnton Blanchard1-4/+17
To use jump labels in assembly we need the HAVE_JUMP_LABEL define, so we select a fallback version if the toolchain does not support them. Modify linux/jump_label.h so it can be included by assembly files. We also need to add -DCC_HAVE_ASM_GOTO to KBUILD_AFLAGS. Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: benh@kernel.crashing.org Cc: catalin.marinas@arm.com Cc: davem@davemloft.net Cc: heiko.carstens@de.ibm.com Cc: jbaron@akamai.com Cc: linux@arm.linux.org.uk Cc: linuxppc-dev@lists.ozlabs.org Cc: liuj97@gmail.com Cc: mgorman@suse.de Cc: mmarek@suse.cz Cc: mpe@ellerman.id.au Cc: paulus@samba.org Cc: ralf@linux-mips.org Cc: rostedt@goodmis.org Cc: schwidefsky@de.ibm.com Cc: will.deacon@arm.com Link: http://lkml.kernel.org/r/1428551492-21977-2-git-send-email-anton@samba.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-08-10jump_label: Fix small typos in the documentationIngo Molnar1-9/+8
Was reading through the documentation of this code and noticed a few typos, missing commas, etc. Cc: Jason Baron <jbaron@akamai.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Mel Gorman <mgorman@suse.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-06-05include/linux/jump_label.h: expose the reference countMel Gorman1-7/+13
This patch exposes the jump_label reference count in preparation for the next patch. cpusets cares about both the jump_label being enabled and how many users of the cpusets there currently are. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Mel Gorman <mgorman@suse.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Jan Kara <jack@suse.cz> Cc: Michal Hocko <mhocko@suse.cz> Cc: Hugh Dickins <hughd@google.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Theodore Ts'o <tytso@mit.edu> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-08jump_label: use defined macros instead of hard-coding for better readabilityJiang Liu1-7/+12
Use macro JUMP_LABEL_TRUE_BRANCH instead of hard-coding for better readability. Acked-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Jason Baron <jbaron@akamai.com> Signed-off-by: Jiang Liu <liuj97@gmail.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2013-11-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds1-0/+10
Pull networking updates from David Miller: 1) The addition of nftables. No longer will we need protocol aware firewall filtering modules, it can all live in userspace. At the core of nftables is a, for lack of a better term, virtual machine that executes byte codes to inspect packet or metadata (arriving interface index, etc.) and make verdict decisions. Besides support for loading packet contents and comparing them, the interpreter supports lookups in various datastructures as fundamental operations. For example sets are supports, and therefore one could create a set of whitelist IP address entries which have ACCEPT verdicts attached to them, and use the appropriate byte codes to do such lookups. Since the interpreted code is composed in userspace, userspace can do things like optimize things before giving it to the kernel. Another major improvement is the capability of atomically updating portions of the ruleset. In the existing netfilter implementation, one has to update the entire rule set in order to make a change and this is very expensive. Userspace tools exist to create nftables rules using existing netfilter rule sets, but both kernel implementations will need to co-exist for quite some time as we transition from the old to the new stuff. Kudos to Patrick McHardy, Pablo Neira Ayuso, and others who have worked so hard on this. 2) Daniel Borkmann and Hannes Frederic Sowa made several improvements to our pseudo-random number generator, mostly used for things like UDP port randomization and netfitler, amongst other things. In particular the taus88 generater is updated to taus113, and test cases are added. 3) Support 64-bit rates in HTB and TBF schedulers, from Eric Dumazet and Yang Yingliang. 4) Add support for new 577xx tigon3 chips to tg3 driver, from Nithin Sujir. 5) Fix two fatal flaws in TCP dynamic right sizing, from Eric Dumazet, Neal Cardwell, and Yuchung Cheng. 6) Allow IP_TOS and IP_TTL to be specified in sendmsg() ancillary control message data, much like other socket option attributes. From Francesco Fusco. 7) Allow applications to specify a cap on the rate computed automatically by the kernel for pacing flows, via a new SO_MAX_PACING_RATE socket option. From Eric Dumazet. 8) Make the initial autotuned send buffer sizing in TCP more closely reflect actual needs, from Eric Dumazet. 9) Currently early socket demux only happens for TCP sockets, but we can do it for connected UDP sockets too. Implementation from Shawn Bohrer. 10) Refactor inet socket demux with the goal of improving hash demux performance for listening sockets. With the main goals being able to use RCU lookups on even request sockets, and eliminating the listening lock contention. From Eric Dumazet. 11) The bonding layer has many demuxes in it's fast path, and an RCU conversion was started back in 3.11, several changes here extend the RCU usage to even more locations. From Ding Tianhong and Wang Yufen, based upon suggestions by Nikolay Aleksandrov and Veaceslav Falico. 12) Allow stackability of segmentation offloads to, in particular, allow segmentation offloading over tunnels. From Eric Dumazet. 13) Significantly improve the handling of secret keys we input into the various hash functions in the inet hashtables, TCP fast open, as well as syncookies. From Hannes Frederic Sowa. The key fundamental operation is "net_get_random_once()" which uses static keys. Hannes even extended this to ipv4/ipv6 fragmentation handling and our generic flow dissector. 14) The generic driver layer takes care now to set the driver data to NULL on device removal, so it's no longer necessary for drivers to explicitly set it to NULL any more. Many drivers have been cleaned up in this way, from Jingoo Han. 15) Add a BPF based packet scheduler classifier, from Daniel Borkmann. 16) Improve CRC32 interfaces and generic SKB checksum iterators so that SCTP's checksumming can more cleanly be handled. Also from Daniel Borkmann. 17) Add a new PMTU discovery mode, IP_PMTUDISC_INTERFACE, which forces using the interface MTU value. This helps avoid PMTU attacks, particularly on DNS servers. From Hannes Frederic Sowa. 18) Use generic XPS for transmit queue steering rather than internal (re-)implementation in virtio-net. From Jason Wang. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1622 commits) random32: add test cases for taus113 implementation random32: upgrade taus88 generator to taus113 from errata paper random32: move rnd_state to linux/random.h random32: add prandom_reseed_late() and call when nonblocking pool becomes initialized random32: add periodic reseeding random32: fix off-by-one in seeding requirement PHY: Add RTL8201CP phy_driver to realtek xtsonic: add missing platform_set_drvdata() in xtsonic_probe() macmace: add missing platform_set_drvdata() in mace_probe() ethernet/arc/arc_emac: add missing platform_set_drvdata() in arc_emac_probe() ipv6: protect for_each_sk_fl_rcu in mem_check with rcu_read_lock_bh vlan: Implement vlan_dev_get_egress_qos_mask as an inline. ixgbe: add warning when max_vfs is out of range. igb: Update link modes display in ethtool netfilter: push reasm skb through instead of original frag skbs ip6_output: fragment outgoing reassembled skb properly MAINTAINERS: mv643xx_eth: take over maintainership from Lennart net_sched: tbf: support of 64bit rates ixgbe: deleting dfwd stations out of order can cause null ptr deref ixgbe: fix build err, num_rx_queues is only available with CONFIG_RPS ...
2013-11-13jump_label: unlikely(x) > 0Roel Kluin1-2/+2
if (unlikely(x) > 0) doesn't seem to help branch prediction Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: "H. Peter Anvin" <hpa@linux.intel.com> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-10-20static_key: WARN on usage before jump_label_init was calledHannes Frederic Sowa1-0/+10
Usage of the static key primitives to toggle a branch must not be used before jump_label_init() is called from init/main.c. jump_label_init reorganizes and wires up the jump_entries so usage before that could have unforeseen consequences. Following primitives are now checked for correct use: * static_key_slow_inc * static_key_slow_dec * static_key_slow_dec_deferred * jump_label_rate_limit The x86 architecture already checks this by testing if the default_nop was already replaced with an optimal nop or with a branch instruction. It will panic then. Other architectures don't check for this. Because we need to relax this check for the x86 arch to allow code to transition from default_nop to the enabled state and other architectures did not check for this at all this patch introduces checking on the static_key primitives in a non-arch dependent manner. All checked functions are considered slow-path so the additional check does no harm to performance. The warnings are best observed with earlyprintk. Based on a patch from Andi Kleen. Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Andi Kleen <andi@firstfloor.org> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-08-09jump_label: Split jumplabel ratelimitAndrew Jones1-27/+1
Commit b202952075f62603bea9bfb6ebc6b0420db11949 ("perf, core: Rate limit perf_sched_events jump_label patching") introduced rate limiting for jump label disabling. The changes were made in the jump label code in order to be more widely available and to keep things tidier. This is all fine, except now jump_label.h includes linux/workqueue.h, which makes it impossible to include jump_label.h from anything that workqueue.h needs. For example, it's now impossible to include jump_label.h from asm/spinlock.h, which is done in proposed pv-ticketlock patches. This patch splits out the rate limiting related changes from jump_label.h into a new file, jump_label_ratelimit.h, to resolve the issue. Signed-off-by: Andrew Jones <drjones@redhat.com> Link: http://lkml.kernel.org/r/1376058122-8248-10-git-send-email-raghavendra.kt@linux.vnet.ibm.com Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com> Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-07-06jump label: Remove static_branch()Jason Baron1-16/+1
Remove the obsolete static_branch() interface, since the supported interface is now static_key_false()/true() - which is used by all in-tree code. See commit: c5905afb0e ("static keys: Introduce 'struct static_key', static_key_true()/false() and static_key_slow_[inc|dec]()"). Signed-off-by: Jason Baron <jbaron@redhat.com> Cc: rostedt@goodmis.org Cc: mathieu.desnoyers@efficios.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/199332c47eef8005d5a5bf1018a80d25929a5746.1340909155.git.jbaron@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-02-28static keys: Inline the static_key_enabled() functionJason Baron1-6/+5
In the jump label enabled case, calling static_key_enabled() results in a function call. The function returns the results of a compare, so it really doesn't need the overhead of a full function call. Let's make it 'static inline' for both the jump label enabled and disabled cases. Signed-off-by: Jason Baron <jbaron@redhat.com> Cc: a.p.zijlstra@chello.nl Cc: rostedt@goodmis.org Cc: mathieu.desnoyers@polymtl.ca Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/201202281849.q1SIn1p2023270@int-mx10.intmail.prod.int.phx2.redhat.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-02-24static keys: Introduce 'struct static_key', static_key_true()/false() and ↵Ingo Molnar1-39/+100
static_key_slow_[inc|dec]() So here's a boot tested patch on top of Jason's series that does all the cleanups I talked about and turns jump labels into a more intuitive to use facility. It should also address the various misconceptions and confusions that surround jump labels. Typical usage scenarios: #include <linux/static_key.h> struct static_key key = STATIC_KEY_INIT_TRUE; if (static_key_false(&key)) do unlikely code else do likely code Or: if (static_key_true(&key)) do likely code else do unlikely code The static key is modified via: static_key_slow_inc(&key); ... static_key_slow_dec(&key); The 'slow' prefix makes it abundantly clear that this is an expensive operation. I've updated all in-kernel code to use this everywhere. Note that I (intentionally) have not pushed through the rename blindly through to the lowest levels: the actual jump-label patching arch facility should be named like that, so we want to decouple jump labels from the static-key facility a bit. On non-jump-label enabled architectures static keys default to likely()/unlikely() branches. Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Jason Baron <jbaron@redhat.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: a.p.zijlstra@chello.nl Cc: mathieu.desnoyers@efficios.com Cc: davem@davemloft.net Cc: ddaney.cavm@gmail.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20120222085809.GA26397@elte.hu Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-01-27jump_label: Add some documentationPeter Zijlstra1-0/+32
akpm figured we could do with a blub explaining what static_branch() is and why it lives... Grumpily-requested-by: Andrew Morton <akpm@linux-foundation.org> Cc: Jason Baron <jbaron@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-h02wu6kabpoojxf03wke704k@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-12-06jump_label: Provide jump_label_key initializersPeter Zijlstra1-0/+3
Provide two initializers for jump_label_key that initialize it enabled or disabled. Also modify all jump_label code to allow for jump_labels to be initialized enabled. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Jason Baron <jbaron@redhat.com> Link: http://lkml.kernel.org/n/tip-p40e3yj21b68y03z1yv825e7@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-12-06perf, core: Rate limit perf_sched_events jump_label patchingGleb Natapov1-0/+24
jump_lable patching is very expensive operation that involves pausing all cpus. The patching of perf_sched_events jump_label is easily controllable from userspace by unprivileged user. When te user runs a loop like this: "while true; do perf stat -e cycles true; done" ... the performance of my test application that just increments a counter for one second drops by 4%. This is on a 16 cpu box with my test application using only one of them. An impact on a real server doing real work will be worse. Performance of KVM PMU drops nearly 50% due to jump_lable for "perf record" since KVM PMU implementation creates and destroys perf event frequently. This patch introduces a way to rate limit jump_label patching and uses it to fix the above problem. I believe that as jump_label use will spread the problem will become more common and thus solving it in a generic code is appropriate. Also fixing it in the perf code would result in moving jump_label accounting logic to perf code with all the ifdefs in case of JUMP_LABEL=n kernel. With this patch all details are nicely hidden inside jump_label code. Signed-off-by: Gleb Natapov <gleb@redhat.com> Acked-by: Jason Baron <jbaron@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20111127155909.GO2557@redhat.com Signed-off-by: Ingo Molnar <mingo@elte.hu>