summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2017-06-20sched/wait: Disambiguate wq_entry->task_list and wq_head->task_list namingIngo Molnar18-68/+66
So I've noticed a number of instances where it was not obvious from the code whether ->task_list was for a wait-queue head or a wait-queue entry. Furthermore, there's a number of wait-queue users where the lists are not for 'tasks' but other entities (poll tables, etc.), in which case the 'task_list' name is actively confusing. To clear this all up, name the wait-queue head and entry list structure fields unambiguously: struct wait_queue_head::task_list => ::head struct wait_queue_entry::task_list => ::entry For example, this code: rqw->wait.task_list.next != &wait->task_list ... is was pretty unclear (to me) what it's doing, while now it's written this way: rqw->wait.head.next != &wait->entry ... which makes it pretty clear that we are iterating a list until we see the head. Other examples are: list_for_each_entry_safe(pos, next, &x->task_list, task_list) { list_for_each_entry(wq, &fence->wait.task_list, task_list) { ... where it's unclear (to me) what we are iterating, and during review it's hard to tell whether it's trying to walk a wait-queue entry (which would be a bug), while now it's written as: list_for_each_entry_safe(pos, next, &x->head, entry) { list_for_each_entry(wq, &fence->wait.head, entry) { Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-20sched/wait: Move bit_wait_table[] and related functionality from ↵Ingo Molnar3-16/+26
sched/core.c to sched/wait_bit.c The key hashed waitqueue data structures and their initialization was done in the main scheduler file for no good reason, move them to sched/wait_bit.c instead. Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-20sched/wait: Split out the wait_bit*() APIs from <linux/wait.h> into ↵Ingo Molnar11-511/+530
<linux/wait_bit.h> The wait_bit*() types and APIs are mixed into wait.h, but they are a pretty orthogonal extension of wait-queues. Furthermore, only about 50 kernel files use these APIs, while over 1000 use the regular wait-queue functionality. So clean up the main wait.h by moving the wait-bit functionality out of it, into a separate .h and .c file: include/linux/wait_bit.h for types and APIs kernel/sched/wait_bit.c for the implementation Update all header dependencies. This reduces the size of wait.h rather significantly, by about 30%. Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-20sched/wait: Re-adjust macro line continuation backslashes in <linux/wait.h>Ingo Molnar1-323/+322
So there's over 300 CPP macro line-continuation backslashes in include/linux/wait.h (!!), which are aligned vertically to make the macro maze a bit more navigable. The recent renames and reorganization broke some of them, and instead of re-aligning them in every patch (which would add a lot of stylistic noise to the patches and make them less readable), I just ignored them - and fixed them up in a single go in this patch. Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-20sched/wait: Improve the bit-wait API parameter names in the API function ↵Ingo Molnar1-15/+15
prototypes Contrary to kernel tradition, most of the bit-wait function prototypes in <linux/wait.h> don't fully define the parameter names, they only list the types: int out_of_line_wait_on_bit_timeout(void *, int, wait_bit_action_f *, unsigned, unsigned long); ... which is pretty passive-aggressive in terms of informing the reader about what these functions are doing. Fill in the parameter names, such as: int out_of_line_wait_on_bit_timeout(void *word, int, wait_bit_action_f *action, unsigned int mode, unsigned long timeout); Also turn spurious (and inconsistently utilized) cases of 'unsigned' into 'unsigned int'. Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-20sched/wait: Standardize wait_bit_queue namingIngo Molnar2-25/+24
So wait-bit-queue head variables are often named: struct wait_bit_queue *q ... which is a bit ambiguous and super confusing, because they clearly suggest wait-queue head semantics and behavior (they rhyme with the old wait_queue_t *q naming), while they are extended wait-queue _entries_, not heads! They are misnomers in two ways: - the 'wait_bit_queue' leaves open the question of whether it's an entry or a head - the 'q' parameter and local variable naming falsely implies that it's a 'queue' - while it's an entry. This resulted in sometimes confusing cases such as: finish_wait(wq, &q->wait); where the 'q' is not a wait-queue head, but a wait-bit-queue entry. So improve this all by standardizing wait-bit-queue nomenclature similar to wait-queue head naming: struct wait_bit_queue => struct wait_bit_queue_entry q => wbq_entry Which makes it all a much clearer: struct wait_bit_queue_entry *wbq_entry ... and turns the former confusing piece of code into: finish_wait(wq_head, &wbq_entry->wq_entry; which IMHO makes it apparently clear what we are doing, without having to analyze the context of the code: we are adding a wait-queue entry to a regular wait-queue head, which entry is embedded in a wait-bit-queue entry. I'm not a big fan of acronyms, but repeating wait_bit_queue_entry in field and local variable names is too long, so Hopefully it's clear enough that 'wq_' prefixes stand for wait-queues, while 'wbq_' prefixes stand for wait-bit-queues. Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-20sched/wait: Standardize 'struct wait_bit_queue' wait-queue entry field nameIngo Molnar6-36/+35
Rename 'struct wait_bit_queue::wait' to ::wq_entry, to more clearly name it as a wait-queue entry. Propagate it to a couple of usage sites where the wait-bit-queue internals are exposed. Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-20sched/wait: Standardize internal naming of wait-queue headsIngo Molnar2-115/+115
The wait-queue head parameters and variables are named in a couple of ways, we have the following variants currently: wait_queue_head_t *q wait_queue_head_t *wq wait_queue_head_t *head In particular the 'wq' naming is ambiguous in the sense whether it's a wait-queue head or entry name - as entries were often named 'wait'. ( Not to mention the confusion of any readers coming over from workqueue-land. ) Standardize all this around a single, unambiguous parameter and variable name: struct wait_queue_head *wq_head which is easy to grep for and also rhymes nicely with the wait-queue entry naming: struct wait_queue_entry *wq_entry Also rename: struct __wait_queue_head => struct wait_queue_head ... and use this struct type to migrate from typedefs usage to 'struct' usage, which is more in line with existing kernel practices. Don't touch any external users and preserve the main wait_queue_head_t typedef. Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-20sched/wait: Standardize internal naming of wait-queue entriesIngo Molnar2-91/+91
So the various wait-queue entry variables in include/linux/wait.h and kernel/sched/wait.c are named in a colorfully inconsistent way: wait_queue_entry_t *wait wait_queue_entry_t *__wait (even in plain C code!) wait_queue_entry_t *q (!) wait_queue_entry_t *new (making anyone who knows C++ cringe) wait_queue_entry_t *old I think part of the reason for the inconsistency is the constant apparent confusion about what a wait queue 'head' versus 'entry' is. ( Some of the documentation talks about a 'wait descriptor', which is the wait-queue entry itself - further adding to the confusion. ) The most common name is 'wait', but that in itself is somewhat ambiguous as well, as it does not really make it clear whether it's a wait-queue entry or head. To improve all this name the wait-queue entry structure parameters and variables consistently and push through this naming into all the wait.h and wait.c code: struct wait_queue_entry *wq_entry The 'wq_' prefix makes it easy to grep for, and we also use the opportunity to move away from the typedef to a plain 'struct' naming: in the kernel we typically reserve typedefs for cases where a C structure is really small and somewhat opaque - such as pte_t. wait-queue entries are neither small nor opaque, so use the more standard 'struct xxx_entry' list management code nomenclature instead. ( We don't touch external users, and we preserve the typedef as well for actual wait-queue users, to reduce unnecessary churn. ) Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-20sched/wait: Rename wait_queue_t => wait_queue_entry_tIngo Molnar94-213/+216
Rename: wait_queue_t => wait_queue_entry_t 'wait_queue_t' was always a slight misnomer: its name implies that it's a "queue", but in reality it's a queue *entry*. The 'real' queue is the wait queue head, which had to carry the name. Start sorting this out by renaming it to 'wait_queue_entry_t'. This also allows the real structure name 'struct __wait_queue' to lose its double underscore and become 'struct wait_queue_entry', which is the more canonical nomenclature for such data types. Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-20Merge tag 'clk-fixes-for-linus' of ↵Linus Torvalds10-8/+19
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk fixes from Stephen Boyd: "One build fix for an Amlogic clk driver and a handful of Allwinner clk driver fixes for some DT bindings and a randconfig build error that all came in this merge window" * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: sunxi-ng: a64: Export PLL_PERIPH0 clock for the PRCM clk: sunxi-ng: h3: Export PLL_PERIPH0 clock for the PRCM dt-bindings: clock: sunxi-ccu: Add pll-periph to PRCM's needed clocks clk: sunxi-ng: sun5i: Fix ahb_bist_clk definition clk: sunxi-ng: enable SUNXI_CCU_MP for PRCM clk: meson: gxbb: fix build error without RESET_CONTROLLER clk: sunxi-ng: v3s: Fix usb otg device reset bit clk: sunxi-ng: a31: Correct lcd1-ch1 clock register offset
2017-06-20Merge tag 'ntb-4.12-bugfixes' of git://github.com/jonmason/ntbLinus Torvalds4-51/+15
Pull NTB fixes from Jon Mason: "NTB bug fixes to address the modinfo in ntb_perf, a couple of bugs in the NTB transport QP calculations, skx doorbells, and sleeping in ntb_async_tx_submit" * tag 'ntb-4.12-bugfixes' of git://github.com/jonmason/ntb: ntb: no sleep in ntb_async_tx_submit ntb: ntb_hw_intel: Skylake doorbells should be 32bits, not 64bits ntb_transport: fix bug calculating num_qps_mw ntb_transport: fix qp count bug NTB: ntb_test: fix bug printing ntb_perf results ntb: Correct modinfo usage statement for ntb_perf
2017-06-19ntb: no sleep in ntb_async_tx_submitAllen Hubbe1-43/+7
Do not sleep in ntb_async_tx_submit, which could deadlock. This reverts commit "8c874cc140d667f84ae4642bb5b5e0d6396d2ca4" Fixes: 8c874cc140d6 ("NTB: Address out of DMA descriptor issue with NTB") Reported-by: Jia-Ju Bai <baijiaju1990@163.com> Signed-off-by: Allen Hubbe <Allen.Hubbe@dell.com> Acked-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2017-06-19ntb: ntb_hw_intel: Skylake doorbells should be 32bits, not 64bitsDave Jiang1-1/+1
Fixing doorbell register length to 32bits per spec. On Skylake NTB, the doorbell registers are 32bit write only registers. The source for the doorbell is a 64bit register that shows the interrupt bits. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Fixes: 783dfa6cc41b ("ntb: Adding Skylake Xeon NTB support") Acked-by: Allen Hubbe <Allen.Hubbe@dell.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2017-06-19ntb_transport: fix bug calculating num_qps_mwLogan Gunthorpe1-2/+2
A divide by zero error occurs if qp_count is less than mw_count because num_qps_mw is calculated to be zero. The calculation appears to be incorrect. The requirement is for num_qps_mw to be set to qp_count / mw_count with any remainder divided among the earlier mws. For example, if mw_count is 5 and qp_count is 12 then mws 0 and 1 will have 3 qps per window and mws 2 through 4 will have 2 qps per window. Thus, when mw_num < qp_count % mw_count, num_qps_mw is 1 higher than when mw_num >= qp_count. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Fixes: e26a5843f7f5 ("NTB: Split ntb_hw_intel and ntb_transport drivers") Acked-by: Allen Hubbe <Allen.Hubbe@dell.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2017-06-19ntb_transport: fix qp count bugLogan Gunthorpe1-2/+2
In cases where there are more mw's than spads/2-2, the mw count gets reduced to match the limitation. ntb_transport also tries to ensure that there are fewer qps than mws but uses the full mw count instead of the reduced one. When this happens, the math in 'ntb_transport_setup_qp_mw' will get confused and result in a kernel paging request bug. This patch fixes the bug by reducing qp_count to the reduced mw count instead of the full mw count. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Fixes: e26a5843f7f5 ("NTB: Split ntb_hw_intel and ntb_transport drivers") Acked-by: Allen Hubbe <Allen.Hubbe@dell.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2017-06-19NTB: ntb_test: fix bug printing ntb_perf resultsLogan Gunthorpe1-1/+1
The code mistakenly prints the local perf results for the remote test so the script reports identical results for both directions. Fix this by ensuring we print the remote result. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Fixes: a9c59ef77458 ("ntb_test: Add a selftest script for the NTB subsystem") Acked-by: Allen Hubbe <Allen.Hubbe@dell.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2017-06-19ntb: Correct modinfo usage statement for ntb_perfGary R Hook1-2/+2
The order parameters are powers of 2; adjust the usage information to use correct mathematical representations. Signed-off-by: Gary R Hook <gary.hook@amd.com> Fixes: 8a7b6a778a85 ("ntb: ntb perf tool") Acked-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2017-06-19Linux 4.12-rc6v4.12-rc6Linus Torvalds1-1/+1
2017-06-19mm: larger stack guard gap, between vmasHugh Dickins23-163/+152
Stack guard page is a useful feature to reduce a risk of stack smashing into a different mapping. We have been using a single page gap which is sufficient to prevent having stack adjacent to a different mapping. But this seems to be insufficient in the light of the stack usage in userspace. E.g. glibc uses as large as 64kB alloca() in many commonly used functions. Others use constructs liks gid_t buffer[NGROUPS_MAX] which is 256kB or stack strings with MAX_ARG_STRLEN. This will become especially dangerous for suid binaries and the default no limit for the stack size limit because those applications can be tricked to consume a large portion of the stack and a single glibc call could jump over the guard page. These attacks are not theoretical, unfortunatelly. Make those attacks less probable by increasing the stack guard gap to 1MB (on systems with 4k pages; but make it depend on the page size because systems with larger base pages might cap stack allocations in the PAGE_SIZE units) which should cover larger alloca() and VLA stack allocations. It is obviously not a full fix because the problem is somehow inherent, but it should reduce attack space a lot. One could argue that the gap size should be configurable from userspace, but that can be done later when somebody finds that the new 1MB is wrong for some special case applications. For now, add a kernel command line option (stack_guard_gap) to specify the stack gap size (in page units). Implementation wise, first delete all the old code for stack guard page: because although we could get away with accounting one extra page in a stack vma, accounting a larger gap can break userspace - case in point, a program run with "ulimit -S -v 20000" failed when the 1MB gap was counted for RLIMIT_AS; similar problems could come with RLIMIT_MLOCK and strict non-overcommit mode. Instead of keeping gap inside the stack vma, maintain the stack guard gap as a gap between vmas: using vm_start_gap() in place of vm_start (or vm_end_gap() in place of vm_end if VM_GROWSUP) in just those few places which need to respect the gap - mainly arch_get_unmapped_area(), and and the vma tree's subtree_gap support for that. Original-patch-by: Oleg Nesterov <oleg@redhat.com> Original-patch-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Tested-by: Helge Deller <deller@gmx.de> # parisc Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-19Merge tag 'armsoc-fixes' of ↵Linus Torvalds5-13/+10
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Olof Johansson: "Stream of fixes has slowed down, only a few this week: - Some DT fixes for Allwinner platforms, and addition of a clock to the R_CCU clock controller that had been missed. - A couple of small DT fixes for am335x-sl50" * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: arm64: allwinner: a64: Add PLL_PERIPH0 clock to the R_CCU ARM: sunxi: h3-h5: Add PLL_PERIPH0 clock to the R_CCU ARM: dts: am335x-sl50: Fix cannot claim requested pins for spi0 ARM: dts: am335x-sl50: Fix card detect pin for mmc1 arm64: allwinner: h5: Remove syslink to shared DTSI ARM: sunxi: h3/h5: fix the compatible of R_CCU
2017-06-19Merge tag 'sunxi-fixes-for-4.12' of ↵Olof Johansson4-7/+8
https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into fixes Allwinner fixes for 4.12 A few fixes around the PRCM support that got in 4.12 with a wrong compatible, and a missing clock in the binding. * tag 'sunxi-fixes-for-4.12' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux: arm64: allwinner: a64: Add PLL_PERIPH0 clock to the R_CCU ARM: sunxi: h3-h5: Add PLL_PERIPH0 clock to the R_CCU arm64: allwinner: h5: Remove syslink to shared DTSI ARM: sunxi: h3/h5: fix the compatible of R_CCU Signed-off-by: Olof Johansson <olof@lixom.net>
2017-06-19Merge tag 'omap-for-v4.12/fixes-sl50' of ↵Olof Johansson1-6/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes Two fixes for am335x-sl50 to fix a boot time error for claiming SPI pins, and to fix a SDIO card detect pin for production version of the device. * tag 'omap-for-v4.12/fixes-sl50' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: ARM: dts: am335x-sl50: Fix cannot claim requested pins for spi0 ARM: dts: am335x-sl50: Fix card detect pin for mmc1 Signed-off-by: Olof Johansson <olof@lixom.net>
2017-06-19Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds1-0/+7
Pull virtio bugfix from Michael Tsirkin: "It turns out balloon does not handle IOMMUs correctly. We should fix that at some point, for now let's just disable this configuration" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: virtio_balloon: disable VIOMMU support
2017-06-19Merge branch 'i2c/for-current-fixed' of ↵Linus Torvalds2-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "Two driver bugfixes" * 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: ismt: fix wrong device address when unmap the data buffer i2c: rcar: use correct length when unmapping DMA
2017-06-19Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linusLinus Torvalds8-31/+34
Pull MIPS fixes from Ralf Baechle: - Three highmem fixes: + Fixed mapping initialization + Adjust the pkmap location + Ensure we use at most one page for PTEs - Fix makefile dependencies for .its targets to depend on vmlinux - Fix reversed condition in BNEZC and JIALC software branch emulation - Only flush initialized flush_insn_slot to avoid NULL pointer dereference - perf: Remove incorrect odd/even counter handling for I6400 - ftrace: Fix init functions tracing * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: MIPS: .its targets depend on vmlinux MIPS: Fix bnezc/jialc return address calculation MIPS: kprobes: flush_insn_slot should flush only if probe initialised MIPS: ftrace: fix init functions tracing MIPS: mm: adjust PKMAP location MIPS: highmem: ensure that we don't use more than one page for PTEs MIPS: mm: fixed mappings: correct initialisation MIPS: perf: Remove incorrect odd/even counter handling for I6400
2017-06-18virtio_balloon: disable VIOMMU supportMichael S. Tsirkin1-0/+7
virtio balloon bypasses the DMA API entirely so does not support the VIOMMU right now. It's not clear we need that support, for now let's just make sure we don't pretend to support it. Cc: stable@vger.kernel.org Cc: Wei Wang <wei.w.wang@intel.com> Fixes: 1a937693993f ("virtio: new feature to detect IOMMU device quirk") Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2017-06-18Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds4-4/+8
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: "Two fixlets for x86: - Handle WARN_ONs proper with the new UD based WARN implementation - Disable 1G mappings when 2M mappings are disabled by kmemleak or debug_pagealloc. Otherwise 1G mappings might still be used, confusing the debug mechanisms" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm: Disable 1GB direct mappings when disabling 2MB mappings x86/debug: Handle early WARN_ONs proper
2017-06-18Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds3-6/+14
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Thomas Gleixner: "Three fixlets for timers: - Two hot-fixes for the alarmtimer based posix timers, which prevent a nasty DOS by self rescheduling timers. The proper cleanup of that mess is queued for 4.13 - Make a function static" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: tick/broadcast: Make tick_broadcast_setup_oneshot() static alarmtimer: Rate limit periodic intervals alarmtimer: Prevent overflow of relative timers
2017-06-18Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds2-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Thomas Gleixner: "Two small fixes for the schedulre core: - Use the proper switch_mm() variant in idle_task_exit() because that code is not called with interrupts disabled. - Fix a confusing typo in a printk" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/core: Idle_task_exit() shouldn't use switch_mm_irqs_off() sched/fair: Fix typo in printk message
2017-06-18Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds9-26/+46
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Thomas Gleixner: "Three fixes for the perf user space side: - Fix the probing of precise_ip level, which got broken recently for x86. - Unbreak the ARCH=x86_64 build - Report module before trying to unwind into the module code, which avoids broken stack frames displayed" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf unwind: Report module before querying isactivation in dwfl unwind perf tools: Fix build with ARCH=x86_64 perf evsel: Fix probing of precise_ip level for default cycles event
2017-06-18Merge branch 'irq-urgent-for-linus' of ↵Linus Torvalds1-1/+3
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fix from Thomas Gleixner: "Add a missing resource release to an error path" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq: Release resources in __setup_irq() error path
2017-06-18Merge branch 'core-urgent-for-linus' of ↵Linus Torvalds1-1/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool fix from Thomas Gleixner: "A single fix which adds fortify_panic to the list of no return functions" * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: objtool: Add fortify_panic as __noreturn function
2017-06-18Merge tag 'led_fixes_for_4.12-rc6' of ↵Linus Torvalds2-33/+2
git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds Pull LED fixes from Jacek Anaszewski: "Two LED fixes: - fix signal source assignment for leds-bcm6328 - revert patch that intended to fix LED behavior on suspend but it had a side effect preventing suspend at all due to uevent being sent on trigger removal" * tag 'led_fixes_for_4.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds: Revert "leds: handle suspend/resume in heartbeat trigger" leds: bcm6328: fix signal source assignment for leds 4 to 7
2017-06-18Merge tag 'usb-4.12-rc6' of ↵Linus Torvalds6-28/+24
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are some small gadget and xhci USB fixes for 4.12-rc6. Nothing major, but one of the gadget patches does fix a reported oops, and the xhci ones resolve reported problems. All have been in linux-next with no reported issues" * tag 'usb-4.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: USB: gadgetfs, dummy-hcd, net2280: fix locking for callbacks usb: xhci: ASMedia ASM1042A chipset need shorts TX quirk usb: xhci: Fix USB 3.1 supported protocol parsing USB: gadget: fix GPF in gadgetfs usb: gadget: composite: make sure to reactivate function on unbind
2017-06-18Merge tag 'staging-4.12-rc6' of ↵Linus Torvalds8-13/+50
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging and IIO fixes from Greg KH: "Here are some small staging and IIO driver fixes for 4.12-rc6. Nothing huge, just a few small driver fixes for reported issues. All have been in linux-next with no reported issues" * tag 'staging-4.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: Staging: rtl8723bs: fix an error code in isFileReadable() iio: buffer-dmaengine: Add missing header buffer_impl.h iio: buffer-dma: Add missing header buffer_impl.h iio: adc: meson-saradc: fix potential crash in meson_sar_adc_clear_fifo iio: adc: mxs-lradc: Fix return value check in mxs_lradc_adc_probe() iio: imu: inv_mpu6050: add accel lpf setting for chip >= MPU6500 staging: iio: ad7152: Fix deadlock in ad7152_write_raw_samp_freq()
2017-06-18Merge tag 'ceph-for-4.12-rc6' of git://github.com/ceph/ceph-clientLinus Torvalds4-6/+8
Pull ceph fixes from Ilya Dryomov: "A fix for an old ceph ->fh_to_* bug from Luis and two timestamp fixups from Zheng, prompted by the ongoing y2038 work" * tag 'ceph-for-4.12-rc6' of git://github.com/ceph/ceph-client: ceph: unify inode i_ctime update ceph: use current_kernel_time() to get request time stamp ceph: check i_nlink while converting a file handle to dentry
2017-06-17Merge tag 'xfs-4.12-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds2-4/+3
Pull xfs fix from Darrick Wong: "One more bugfix for you for 4.12-rc6 to fix something that came up in an earlier rc: - Fix some bogus ASSERT failures on CONFIG_SMP=n and CONFIG_XFS_DEBUG=y" * tag 'xfs-4.12-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: fix spurious spin_is_locked() assert failures on non-smp kernels
2017-06-17Merge branch 'ufs-fixes' of ↵Linus Torvalds6-68/+98
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull ufs fixes from Al Viro: "Fix assorted ufs bugs: a couple of deadlocks, fs corruption in truncate(), oopsen on tail unpacking and truncate when racing with vmscan, mild fs corruption (free blocks stats summary buggered, *BSD fsck would complain and fix), several instances of broken logics around reserved blocks (starting with "check almost never triggers when it should" and then there are issues with sufficiently large UFS2)" [ Note: ufs hasn't gotten any loving in a long time, because nobody really seems to use it. These ufs fixes are triggered by people actually caring now, not some sudden influx of new bugs. - Linus ] * 'ufs-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: ufs_truncate_blocks(): fix the case when size is in the last direct block ufs: more deadlock prevention on tail unpacking ufs: avoid grabbing ->truncate_mutex if possible ufs_get_locked_page(): make sure we have buffer_heads ufs: fix s_size/s_dsize users ufs: fix reserved blocks check ufs: make ufs_freespace() return signed ufs: fix logics in "ufs: make fsck -f happy"
2017-06-17Merge branch 'for-linus' of ↵Linus Torvalds2-6/+6
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs fixes from Al Viro: "A couple of fixes; a leak in mntns_install() caught by Andrei (this cycle regression) + d_invalidate() softlockup fix - that had been reported by a bunch of people lately, but the problem is pretty old" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fs: don't forget to put old mntns in mntns_install Hang/soft lockup in d_invalidate with simultaneous calls
2017-06-17Merge tag 'pci-v4.12-fixes-2' of ↵Linus Torvalds2-6/+7
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fixes from Bjorn Helgaas: - fix another PCI_ENDPOINT build error (merged for v4.12) - fix error codes added to config accessors for v4.12 * tag 'pci-v4.12-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: endpoint: Select CRC32 to fix test build error PCI: Make error code types consistent in pci_{read,write}_config_*
2017-06-17Merge tag 'fbdev-v4.12-rc6' of git://github.com/bzolnier/linuxLinus Torvalds4-12/+12
Pull fbdev fixes from Bartlomiej Zolnierkiewicz: - fix udlfb driver to stop spamming logs (Mike Gerow) - add missing endianness conversions in smscufx & udlfb drivers (Johan Hovold) - fix few gcc warnings/errors (Arnd Bergmann) * tag 'fbdev-v4.12-rc6' of git://github.com/bzolnier/linux: video: fbdev: udlfb: drop log level for blanking video: fbdev: via: remove possibly unused variables video: fbdev: add missing USB-descriptor endianness conversions video: fbdev: avoid int-in-bool-context warning
2017-06-17Merge branch 'akpm' (patches from Andrew)Linus Torvalds5-13/+38
Merge misc fixes from Andrew Morton: "5 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm: correct the comment when reclaimed pages exceed the scanned pages userfaultfd: shmem: handle coredumping in handle_userfault() mm: numa: avoid waiting on freed migrated pages swap: cond_resched in swap_cgroup_prepare() mm/memory-failure.c: use compound_head() flags for huge pages
2017-06-17mm: correct the comment when reclaimed pages exceed the scanned pageszhongjiang1-3/+3
Commit e1587a494540 ("mm: vmpressure: fix sending wrong events on underflow") declared that reclaimed pages exceed the scanned pages due to the thp reclaim. That is incorrect because THP will be spilt to normal page and loop again, which will result in the scanned pages increment. [akpm@linux-foundation.org: tweak comment text] Link: http://lkml.kernel.org/r/1496824266-25235-1-git-send-email-zhongjiang@huawei.com Signed-off-by: zhongjiang <zhongjiang@huawei.com> Acked-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-17userfaultfd: shmem: handle coredumping in handle_userfault()Andrea Arcangeli1-8/+21
Anon and hugetlbfs handle FOLL_DUMP set by get_dump_page() internally to __get_user_pages(). shmem as opposed has no special FOLL_DUMP handling there so handle_mm_fault() is invoked without mmap_sem and ends up calling handle_userfault() that isn't expecting to be invoked without mmap_sem held. This makes handle_userfault() fail immediately if invoked through shmem_vm_ops->fault during coredumping and solves the problem. The side effect is a BUG_ON with no lock held triggered by the coredumping process which exits. Only 4.11 is affected, pre-4.11 anon memory holes are skipped in __get_user_pages by checking FOLL_DUMP explicitly against empty pagetables (mm/gup.c:no_page_table()). It's zero cost as we already had a check for current->flags to prevent futex to trigger userfaults during exit (PF_EXITING). Link: http://lkml.kernel.org/r/20170615214838.27429-1-aarcange@redhat.com Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Reported-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com> Cc: <stable@vger.kernel.org> [4.11+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-17mm: numa: avoid waiting on freed migrated pagesMark Rutland1-1/+7
In do_huge_pmd_numa_page(), we attempt to handle a migrating thp pmd by waiting until the pmd is unlocked before we return and retry. However, we can race with migrate_misplaced_transhuge_page(): // do_huge_pmd_numa_page // migrate_misplaced_transhuge_page() // Holds 0 refs on page // Holds 2 refs on page vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd); /* ... */ if (pmd_trans_migrating(*vmf->pmd)) { page = pmd_page(*vmf->pmd); spin_unlock(vmf->ptl); ptl = pmd_lock(mm, pmd); if (page_count(page) != 2)) { /* roll back */ } /* ... */ mlock_migrate_page(new_page, page); /* ... */ spin_unlock(ptl); put_page(page); put_page(page); // page freed here wait_on_page_locked(page); goto out; } This can result in the freed page having its waiters flag set unexpectedly, which trips the PAGE_FLAGS_CHECK_AT_PREP checks in the page alloc/free functions. This has been observed on arm64 KVM guests. We can avoid this by having do_huge_pmd_numa_page() take a reference on the page before dropping the pmd lock, mirroring what we do in __migration_entry_wait(). When we hit the race, migrate_misplaced_transhuge_page() will see the reference and abort the migration, as it may do today in other cases. Fixes: b8916634b77bffb2 ("mm: Prevent parallel splits during THP migration") Link: http://lkml.kernel.org/r/1497349722-6731-2-git-send-email-will.deacon@arm.com Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Acked-by: Steve Capper <steve.capper@arm.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Mel Gorman <mgorman@suse.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-17swap: cond_resched in swap_cgroup_prepare()Yu Zhao1-0/+3
I saw need_resched() warnings when swapping on large swapfile (TBs) because continuously allocating many pages in swap_cgroup_prepare() took too long. We already cond_resched when freeing page in swap_cgroup_swapoff(). Do the same for the page allocation. Link: http://lkml.kernel.org/r/20170604200109.17606-1-yuzhao@google.com Signed-off-by: Yu Zhao <yuzhao@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-17mm/memory-failure.c: use compound_head() flags for huge pagesJames Morse1-1/+4
memory_failure() chooses a recovery action function based on the page flags. For huge pages it uses the tail page flags which don't have anything interesting set, resulting in: > Memory failure: 0x9be3b4: Unknown page state > Memory failure: 0x9be3b4: recovery action for unknown page: Failed Instead, save a copy of the head page's flags if this is a huge page, this means if there are no relevant flags for this tail page, we use the head pages flags instead. This results in the me_huge_page() recovery action being called: > Memory failure: 0x9b7969: recovery action for huge page: Delayed For hugepages that have not yet been allocated, this allows the hugepage to be dequeued. Fixes: 524fca1e7356 ("HWPOISON: fix misjudgement of page_action() for errors on mlocked pages") Link: http://lkml.kernel.org/r/20170524130204.21845-1-james.morse@arm.com Signed-off-by: James Morse <james.morse@arm.com> Tested-by: Punit Agrawal <punit.agrawal@arm.com> Acked-by: Punit Agrawal <punit.agrawal@arm.com> Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-16Merge tag 'powerpc-4.12-6' of ↵Linus Torvalds5-10/+13
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "Three small fixes for recently merged code: - remove a spurious WARN_ON when a PCI device has no of_node, it's allowed in some circumstances for there to be no of_node. - fix the offset for store EOI MMIOs in the XIVE interrupt controller. - fix non-const WARN_ONs which were becoming BUGs due to them losing BUGFLAG_WARNING in a recent cleanup patch. Thanks to: Alexey Kardashevskiy, Alistair Popple, Benjamin Herrenschmidt" * tag 'powerpc-4.12-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/debug: Add missing warn flag to WARN_ON's non-builtin path powerpc/xive: Fix offset for store EOI MMIOs powerpc/npu-dma: Remove spurious WARN_ON when a PCI device has no of_node
2017-06-16Merge tag 'perf-urgent-for-mingo-4.12-20170616' of ↵Ingo Molnar9-26/+46
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/urgent fixes from Arnaldo Carvalho de Melo: - Fix probing of precise_ip level for default cycles event, that got broken recently on x86_64 when its arch code started considering invalid requesting precise samples when not sampling (i.e. when attr.sample_period == 0). This also fixes another problem in s/390 where the precision probing with sample_period == 0 returned precise_ip > 0, that then, when setting up the real cycles event (not probing) would return EOPNOTSUPP for precise_ip > 0 (as determined previously by probing) and sample_period > 0. These problems resulted in attr_precise not being set to the highest precision available on x86.64 when no event was specified, i.e. the canonical: perf record ./workload would end up using attr.precise_ip = 0. As a workaround this would need to be done: perf record -e cycles:P ./workload And on s/390 it would plain not work, requiring using: perf record -e cycles ./workload as a workaround. (Arnaldo Carvalho de Melo) - Fix perf build with ARCH=x86_64, when ARCH should be transformed into ARCH=x86, just like with the main kernel Makefile and tools/objtool's, i.e. use SRCARCH. (Jiada Wang) - Avoid accessing uninitialized data structures when unwinding with elfutils's libdw, making it more closely mimic libunwind's unwinder. (Milian Wolff) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>