summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2014-09-16Documentation/scheduler/sched-deadline.txt: Improve and clarify AC bitsLuca Abeni1-17/+82
Admission control is of key importance for SCHED_DEADLINE, since it guarantees system schedulability (or tells us something about the degree of guarantees we can provide to the user). This patch improves and clarifies bits and pieces regarding AC, both for UP and SMP systems. Signed-off-by: Luca Abeni <luca.abeni@unitn.it> Signed-off-by: Juri Lelli <juri.lelli@arm.com> Reviewed-by: Henrik Austad <henrik@austad.us> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Dario Faggioli <raistlin@linux.it> Cc: Juri Lelli <juri.lelli@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1410256636-26171-4-git-send-email-juri.lelli@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-09-16Documentation/scheduler/sched-deadline.txt: Rewrite section 4 introJuri Lelli1-26/+25
Section 4 intro was still describing the old interface. Rewrite it. Signed-off-by: Juri Lelli <juri.lelli@arm.com> Signed-off-by: Luca Abeni <luca.abeni@unitn.it> Reviewed-by: Henrik Austad <henrik@austad.us> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Dario Faggioli <raistlin@linux.it> Cc: Juri Lelli <juri.lelli@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1410256636-26171-3-git-send-email-juri.lelli@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-09-16Documentation/scheduler/sched-deadline.txt: Fix terminology and improve clarityLuca Abeni1-18/+20
Several small changes regarding SCHED_DEADLINE documentation that fix terminology and improve clarity and readability: - "current runtime" becomes "remaining runtime" - readablity of an equation is improved by introducing more spacing - clarify when admission control will certainly fail - new URL for CBS technical report - substitue "smallest" with "earliest" Signed-off-by: Luca Abeni <luca.abeni@unitn.it> Signed-off-by: Juri Lelli <juri.lelli@arm.com> Reviewed-by: Henrik Austad <henrik@austad.us> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Dario Faggioli <raistlin@linux.it> Cc: Juri Lelli <juri.lelli@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1410256636-26171-2-git-send-email-juri.lelli@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-09-09sched: Reduce contention in update_cfs_rq_blocked_load()Jason Low1-0/+3
When running workloads on 2+ socket systems, based on perf profiles, the update_cfs_rq_blocked_load() function often shows up as taking up a noticeable % of run time. Much of the contention is in __update_cfs_rq_tg_load_contrib() when we update the tg load contribution stats. However, it turns out that in many cases, they don't need to be updated and "tg_contrib" is 0. This patch adds a check in __update_cfs_rq_tg_load_contrib() to skip updating tg load contribution stats when nothing needs to be updated. This reduces the cacheline contention that would be unnecessary. Reviewed-by: Ben Segall <bsegall@google.com> Reviewed-by: Waiman Long <Waiman.Long@hp.com> Signed-off-by: Jason Low <jason.low2@hp.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Paul Turner <pjt@google.com> Cc: jason.low2@hp.com Cc: Yuyang Du <yuyang.du@intel.com> Cc: Aswin Chandramouleeswaran <aswin@hp.com> Cc: Chegu Vinod <chegu_vinod@hp.com> Cc: Scott J Norton <scott.norton@hp.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1409643684.19197.15.camel@j-VirtualBox Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-09-09sched: Migrate waking tasksLai Jiangshan1-1/+7
Current code can fail to migrate a waking task (silently) when TTWU_QUEUE is enabled. When a task is waking, it is pending on the wake_list of the rq, but it is not queued (task->on_rq == 0). In this case, set_cpus_allowed_ptr() and __migrate_task() will not migrate it because its invisible to them. This behavior is incorrect, because the task has been already woken, it will be running on the wrong CPU without correct placement until the next wake-up or update for cpus_allowed. To fix this problem, we need to finish the wakeup (so they appear on the runqueue) before we migrate them. Reported-by: Sasha Levin <sasha.levin@oracle.com> Reported-by: Jason J. Herne <jjherne@linux.vnet.ibm.com> Tested-by: Jason J. Herne <jjherne@linux.vnet.ibm.com> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/538ED7EB.5050303@cn.fujitsu.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-09-08sched, time: Atomically increment stime & utimeRik van Riel1-2/+5
The functions task_cputime_adjusted and thread_group_cputime_adjusted() can be called locklessly, as well as concurrently on many different CPUs. This can occasionally lead to the utime and stime reported by times(), and other syscalls like it, going backward. The cause for this appears to be multiple threads racing in cputime_adjust(), both with values for utime or stime that is larger than the original, but each with a different value. Sometimes the larger value gets saved first, only to be immediately overwritten with a smaller value by another thread. Using atomic exchange prevents that problem, and ensures time progresses monotonically. Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: umgwanakikbuti@gmail.com Cc: fweisbec@gmail.com Cc: akpm@linux-foundation.org Cc: srao@redhat.com Cc: lwoodman@redhat.com Cc: atheurer@redhat.com Cc: oleg@redhat.com Link: http://lkml.kernel.org/r/1408133138-22048-4-git-send-email-riel@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-09-08time, signal: Protect resource use statistics with seqlockRik van Riel6-29/+26
Both times() and clock_gettime(CLOCK_PROCESS_CPUTIME_ID) have scalability issues on large systems, due to both functions being serialized with a lock. The lock protects against reporting a wrong value, due to a thread in the task group exiting, its statistics reporting up to the signal struct, and that exited task's statistics being counted twice (or not at all). Protecting that with a lock results in times() and clock_gettime() being completely serialized on large systems. This can be fixed by using a seqlock around the events that gather and propagate statistics. As an additional benefit, the protection code can be moved into thread_group_cputime(), slightly simplifying the calling functions. In the case of posix_cpu_clock_get_task() things can be simplified a lot, because the calling function already ensures that the task sticks around, and the rest is now taken care of in thread_group_cputime(). This way the statistics reporting code can run lockless. Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alex Thorlton <athorlton@sgi.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Daeseok Youn <daeseok.youn@gmail.com> Cc: David Rientjes <rientjes@google.com> Cc: Dongsheng Yang <yangds.fnst@cn.fujitsu.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Guillaume Morin <guillaume@morinfr.org> Cc: Ionut Alexa <ionut.m.alexa@gmail.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Li Zefan <lizefan@huawei.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Michal Schmidt <mschmidt@redhat.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Vladimir Davydov <vdavydov@parallels.com> Cc: umgwanakikbuti@gmail.com Cc: fweisbec@gmail.com Cc: srao@redhat.com Cc: lwoodman@redhat.com Cc: atheurer@redhat.com Link: http://lkml.kernel.org/r/20140816134010.26a9b572@annuminas.surriel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-09-08exit: Always reap resource stats in __exit_signal()Rik van Riel1-22/+21
Oleg pointed out that wait_task_zombie adds a task's usage statistics to the parent's signal struct, but the task's own signal struct should also propagate the statistics at exit time. This allows thread_group_cputime(reaped_zombie) to get the statistics after __unhash_process() has made the task invisible to for_each_thread, but before the thread has actually been rcu freed, making sure no non-monotonic results are returned inside that window. Suggested-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: David Rientjes <rientjes@google.com> Cc: Guillaume Morin <guillaume@morinfr.org> Cc: Ionut Alexa <ionut.m.alexa@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Li Zefan <lizefan@huawei.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Michal Schmidt <mschmidt@redhat.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: umgwanakikbuti@gmail.com Cc: fweisbec@gmail.com Cc: srao@redhat.com Cc: lwoodman@redhat.com Cc: atheurer@redhat.com Link: http://lkml.kernel.org/r/1408133138-22048-2-git-send-email-riel@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-09-08Merge tag 'v3.17-rc4' into sched/core, to prevent conflicts with upcoming ↵Ingo Molnar9224-441113/+375888
patches, and to refresh the tree Linux 3.17-rc4
2014-09-08Linux 3.17-rc4v3.17-rc4Linus Torvalds1-1/+1
2014-09-08Documentation: new page link in SubmittingPatchesSudip Mukherjee1-0/+1
new link for - How to piss off a Linux kernel subsystem maintainer Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-09-08Documentation: NFS/RDMA: Document separate Kconfig symbolsPaul Bolle1-7/+9
The NFS/RDMA Kconfig symbol was split into separate options for client and server in commit 2e8c12e1b765 ("xprtrdma: add separate Kconfig options for NFSoRDMA client and server support"). Update the documentation to reflect this split. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: "J. Bruce Fields" <bfields@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-09-08Documentation: misc-devices: Rename freefall.c from hpfall.c in lis2lv02dMasanari Iida1-1/+1
hpfall.c was renamed to freefall.c in 3.16, but this file still refer to hpfall.c instead of freefall.c Signed-off-by: Masanari Iida <standby24x7@gmail.com> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-09-08Documentation: i2c: rename variable "register" to "reg"Jose Manuel Alarcon Roldan1-5/+5
The example code provided with the i2c device interface documentation won't compile since it uses the reserved word "register" to name a variable. The compiler fails with this error message: error: expected identifier or '(' before '=' token __u8 register = 0x20; /* Device register to access */ ^ Rename the variable "register" to simply "reg" in the example code. Another couple of typos has been fixed as well. [Change "! =" to "!=".] Signed-off-by: Jose Alarcon Roldan <jose.alarcon.roldan@gmail.com> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Wolfram Sang <wsa@the-dreams.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-09-08Documentation: seq_file: Document seq_open_private(), seq_release_private()Rob Jones1-0/+33
Despite the fact that these functions have been around for years, they are little used (only 15 uses in 13 files at the preseht time) even though many other files use work-arounds to achieve the same result. By documenting them, hopefully they will become more widely used. Signed-off-by: Rob Jones <rob.jones@codethink.co.uk> Acked-by: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-09-07Merge tag 'pm+acpi-3.17-rc4' of ↵Linus Torvalds12-34/+122
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI and power management fixes from Rafael Wysocki: "These are regression fixes (ACPI sysfs, ACPI video, suspend test), ACPI cpuidle deadlock fix, missing runtime validation of ACPI _DSD output, a fix and a new CPU ID for the RAPL driver, new blacklist entry for the ACPI EC driver and a couple of trivial cleanups (intel_pstate and generic PM domains). Specifics: - Fix for recently broken test_suspend= command line argument (Rafael Wysocki). - Fixes for regressions related to the ACPI video driver caused by switching the default to native backlight handling in 3.16 from Hans de Goede. - Fix for a sysfs attribute of ACPI device objects that returns stale values sometimes due to the fact that they are cached instead of executing the appropriate method (_SUN) every time (broken in 3.14). From Yasuaki Ishimatsu. - Fix for a deadlock between cpuidle_lock and cpu_hotplug.lock in the ACPI processor driver from Jiri Kosina. - Runtime output validation for the ACPI _DSD device configuration object missing from the support for it that has been introduced recently. From Mika Westerberg. - Fix for an unuseful and misleading RAPL (Running Average Power Limit) domain detection message in the RAPL driver from Jacob Pan. - New Intel Haswell CPU ID for the RAPL driver from Jason Baron. - New Clevo W350etq blacklist entry for the ACPI EC driver from Lan Tianyu. - Cleanup for the intel_pstate driver and the core generic PM domains code from Gabriele Mazzotta and Geert Uytterhoeven" * tag 'pm+acpi-3.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI / cpuidle: fix deadlock between cpuidle_lock and cpu_hotplug.lock ACPI / scan: not cache _SUN value in struct acpi_device_pnp cpufreq: intel_pstate: Remove unneeded variable powercap / RAPL: change domain detection message powercap / RAPL: add support for CPU model 0x3f PM / domains: Make generic_pm_domain.name const PM / sleep: Fix test_suspend= command line option ACPI / EC: Add msi quirk for Clevo W350etq ACPI / video: Disable native_backlight on HP ENVY 15 Notebook PC ACPI / video: Add a disable_native_backlight quirk ACPI / video: Fix use_native_backlight selection logic ACPICA: ACPI 5.1: Add support for runtime validation of _DSD package.
2014-09-07Merge branch 'for-linus' of ↵Linus Torvalds4-14/+18
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull filesystem fixes from Al Viro: "Several bugfixes (all of them -stable fodder). Alexey's one deals with double mutex_lock() in UFS (apparently, nobody has tried to test "ufs: sb mutex merge + mutex_destroy" on something like file creation/removal on ufs). Mine deal with two kinds of umount bugs, in umount propagation and in handling of automounted submounts, both resulting in bogus transient EBUSY from umount" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: ufs: fix deadlocks introduced by sb mutex merge fix EBUSY on umount() from MNT_SHRINKABLE get rid of propagate_umount() mistakenly treating slaves as busy.
2014-09-07Merge branch 'core-urgent-for-linus' of ↵Linus Torvalds2-12/+12
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU fix from Ingo Molnar: "A boot hang fix for the offloaded callback RCU model (RCU_NOCB_CPU=y && (TREE_CPU=y || TREE_PREEMPT_RC)) in certain bootup scenarios" * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: rcu: Make nocb leader kthreads process pending callbacks after spawning
2014-09-07Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds4-11/+39
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Thomas Gleixner: "Three fixlets from the timer departement: - Update the timekeeper before updating vsyscall and pvclock. This fixes the kvm-clock regression reported by Chris and Paolo. - Use the proper irq work interface from NMI. This fixes the regression reported by Catalin and Dave. - Clarify the compat_nanosleep error handling mechanism to avoid future confusion" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: timekeeping: Update timekeeper before updating vsyscall and pvclock compat: nanosleep: Clarify error handling nohz: Restore NMI safe local irq work for local nohz kick
2014-09-07ufs: fix deadlocks introduced by sb mutex mergeAlexey Khoroshilov2-13/+8
Commit 0244756edc4b ("ufs: sb mutex merge + mutex_destroy") introduces deadlocks in ufs_new_inode() and ufs_free_inode(). Most callers of that functions acqure the mutex by themselves and ufs_{new,free}_inode() do that via lock_ufs(), i.e we have an unavoidable double lock. The patch proposes to resolve the issue by making sure that ufs_{new,free}_inode() are not called with the mutex held. Found by Linux Driver Verification project (linuxtesting.org). Cc: stable@vger.kernel.org # 3.16 Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-09-07sched/deadline: Fix a precision problem in the microseconds rangexiaofeng.yan3-13/+10
An overrun could happen in function start_hrtick_dl() when a task with SCHED_DEADLINE runs in the microseconds range. For example, if a task with SCHED_DEADLINE has the following parameters: Task runtime deadline period P1 200us 500us 500us The deadline and period from task P1 are less than 1ms. In order to achieve microsecond precision, we need to enable HRTICK feature by the next command: PC#echo "HRTICK" > /sys/kernel/debug/sched_features PC#trace-cmd record -e sched_switch & PC#./schedtool -E -t 200000:500000:500000 -e ./test The binary test is in an endless while(1) loop here. Some pieces of trace.dat are as follows: <idle>-0 157.603157: sched_switch: :R ==> 2481:4294967295: test test-2481 157.603203: sched_switch: 2481:R ==> 0:120: swapper/2 <idle>-0 157.605657: sched_switch: :R ==> 2481:4294967295: test test-2481 157.608183: sched_switch: 2481:R ==> 2483:120: trace-cmd trace-cmd-2483 157.609656: sched_switch:2483:R==>2481:4294967295: test We can get the runtime of P1 from the information above: runtime = 157.608183 - 157.605657 runtime = 0.002526(2.526ms) The correct runtime should be less than or equal to 200us at some point. The problem is caused by a conditional judgment "delta > 10000" in function start_hrtick_dl(). Because no hrtimer start up to control the rest of runtime when the reset of runtime is less than 10us. So the process will continue to run until tick-period is coming. Move the code with the limit of the least time slice from hrtick_start_fair() to hrtick_start() because the EDF schedule class also needs this function in start_hrtick_dl(). To fix this problem, we call hrtimer_start() unconditionally in start_hrtick_dl(), and make sure the scheduling slice won't be smaller than 10us in hrtimer_start(). Signed-off-by: Xiaofeng Yan <xiaofeng.yan@huawei.com> Reviewed-by: Li Zefan <lizefan@huawei.com> Acked-by: Juri Lelli <juri.lelli@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1409022941-5880-1-git-send-email-xiaofeng.yan@huawei.com [ Massaged the changelog and the code. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-09-07Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds8-17/+28
Pull kvm fixes from Paolo Bonzini: "A smattering of bug fixes across most architectures" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: powerpc/kvm/cma: Fix panic introduces by signed shift operation KVM: s390/mm: Fix guest storage key corruption in ptep_set_access_flags KVM: s390/mm: Fix storage key corruption during swapping arm/arm64: KVM: Complete WFI/WFE instructions ARM/ARM64: KVM: Nuke Hyp-mode tlbs before enabling MMU KVM: s390/mm: try a cow on read only pages for key ops KVM: s390: Fix user triggerable bug in dead code
2014-09-06Merge tag 'fixes-for-linus' of ↵Linus Torvalds12-59/+90
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Kevin Hilman: "Another round of fixes from arm-soc land, which are mostly DT fixes for: - OMAP: handful of DT fixes devices on newly supported hardware - davinci: fix 2nd EDMA channel - ux500: extend previous pinctrl fix to another board - at91: clock registration fixes, compatibility string precision And one more fix for event cleanup in drivers/bus/arm-ccn" * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: bus: arm-ccn: Move event cleanup routine ARM: at91/dt: rm9200: fix usb clock definition ARM: at91: rm9200: fix clock registration ARM: at91/dt: sam9g20: set at91sam9g20 pllb driver ARM: dts: dra7-evm: Add vtt regulator support ARM: dts: dra7-evm: Fix spi1 mux documentation ARM: dts: am43x-epos-evm: Disable QSPI to prevent conflict with GPMC-NAND ARM: OMAP2+: gpmc: Don't complain if wait pin is used without r/w monitoring ARM: dts: am43xx-epos-evm: Don't use read/write wait monitoring ARM: dts: am437x-gp-evm: Don't use read/write wait monitoring ARM: dts: am437x-gp-evm: Use BCH16 ECC scheme instead of BCH8 ARM: dts: am43x-epos-evm: Use BCH16 ECC scheme instead of BCH8 ARM: dts: am4372: fix USB regs size ARM: dts: am437x-gp: switch i2c0 to 100KHz ARM: dts: dra7-evm: Fix 8th NAND partition's name ARM: dts: dra7-evm: Fix i2c3 pinmux and frequency ARM: ux500: disable msp2 node on Snowball ARM: edma: Fix configuration parsing for SoCs with multiple eDMA3 CC ARM: dts: set 'ti,set-rate-parent' for dpll4_m5x2 clock
2014-09-06Merge tag 'xfs-for-linus-3.17-rc3' of ↵Linus Torvalds4-12/+114
git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs Pull xfs fixes from Dave Chinner: "The fixes all address recently discovered data corruption issues. The original Direct IO issue was discovered by Chris Mason @ Facebook on a production workload which mixed buffered reads with direct reads and writes IO to the same file. The fix for that exposed other issues with page invalidation (exposed by millions of fsx operations) failing due to dirty buffers beyond EOF. Finally, the collapse_range code could also cause problems due to racing writeback changing the extent map while it was being shifted around. The commits for that problem are simple mitigation fixes that prevent the problem from occuring. A more robust fix for 3.18 that addresses the underlying problem is currently being worked on by Brian. Summary of fixes: - a direct IO read/buffered read data corruption - the associated fallout from the DIO data corruption fix - collapse range bugs that are potential data corruption issues" * tag 'xfs-for-linus-3.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs: xfs: trim eofblocks before collapse range xfs: xfs_file_collapse_range is delalloc challenged xfs: don't log inode unless extent shift makes extent modifications xfs: use ranged writeback and invalidation for direct IO xfs: don't zero partial page cache pages during O_DIRECT writes xfs: don't zero partial page cache pages during O_DIRECT writes xfs: don't dirty buffers beyond EOF
2014-09-06Merge tag 'for-linus-20140905' of git://git.infradead.org/linux-mtdLinus Torvalds2-1/+5
Pull mtd fixes from Brian Norris: "Two trivial MTD updates for 3.17-rc4: - a tiny comment tweak, to kill a bunch of DocBook warnings added during the merge window - a small fixup to the OTP routines' error handling" * tag 'for-linus-20140905' of git://git.infradead.org/linux-mtd: mtd: nand: fix DocBook warnings on nand_sdr_timings doc mtd: cfi_cmdset_0002: check return code for get_chip()
2014-09-06timekeeping: Update timekeeper before updating vsyscall and pvclockThomas Gleixner1-2/+3
The update_walltime() code works on the shadow timekeeper to make the seqcount protected region as short as possible. But that update to the shadow timekeeper does not update all timekeeper fields because it's sufficient to do that once before it becomes life. One of these fields is tkr.base_mono. That stays stale in the shadow timekeeper unless an operation happens which copies the real timekeeper to the shadow. The update function is called after the update calls to vsyscall and pvclock. While not correct, it did not cause any problems because none of the invoked update functions used base_mono. commit cbcf2dd3b3d4 (x86: kvm: Make kvm_get_time_and_clockread() nanoseconds based) changed that in the kvm pvclock update function, so the stale mono_base value got used and caused kvm-clock to malfunction. Put the update where it belongs and fix the issue. Reported-by: Chris J Arges <chris.j.arges@canonical.com> Reported-by: Paolo Bonzini <pbonzini@redhat.com> Cc: Gleb Natapov <gleb@kernel.org> Cc: John Stultz <john.stultz@linaro.org> Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1409050000570.3333@nanos Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-09-06compat: nanosleep: Clarify error handlingThomas Gleixner1-3/+21
The error handling in compat_sys_nanosleep() is correct, but completely non obvious. Document it and restrict it to the -ERESTART_RESTARTBLOCK return value for clarity. Reported-by: Kees Cook <keescook@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-09-06Merge branch 'i2c/for-current' of ↵Linus Torvalds4-14/+62
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c bugfixes from Wolfram Sang: "I2C driver bugfixes for the 3.17 release. Details can be found in the commit messages, yet I think this is typical driver stuff" * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: Revert "i2c: rcar: remove spinlock" i2c: at91: add bound checking on SMBus block length bytes i2c: rk3x: fix bug that cause transfer fails in master receive mode i2c: at91: Fix a race condition during signal handling in at91_do_twi_xfer. i2c: mv64xxx: continue probe when clock-frequency is missing i2c: rcar: fix MNR interrupt handling
2014-09-06Merge tag 'at91-fixes' of git://github.com/at91linux/linux-at91 into fixesKevin Hilman3-2/+12
Merge "at91: fixes for 3.17 #1" from Nicols Ferre: First AT91 fixes batch for 3.17: - compatibility string precision - clock registration and USB DT fix for at91rm9200 * tag 'at91-fixes' of git://github.com/at91linux/linux-at91: ARM: at91/dt: rm9200: fix usb clock definition ARM: at91: rm9200: fix clock registration ARM: at91/dt: sam9g20: set at91sam9g20 pllb driver Signed-off-by: Kevin Hilman <khilman@linaro.org>
2014-09-06bus: arm-ccn: Move event cleanup routinePawel Moll1-26/+25
The function cleaning up an initialized event was called from the "event_del" handler, instead of being used as the "destroy" callback. In case of events group allocation this caused NULL pointer dereference (as events are added and deleted multiple times then). Fixed now. Signed-off-by: Pawel Moll <mail@pawelmoll.com> Signed-off-by: Kevin Hilman <khilman@linaro.org>
2014-09-05Merge branch 'for-linus' of ↵Linus Torvalds3-1/+5
git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k Pull m68k updates from Geert Uytterhoeven: "Wire up new syscalls getrandom and memfd_create" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k: m68k: Wire up memfd_create m68k: Wire up getrandom
2014-09-05ARM: at91/dt: rm9200: fix usb clock definitionAlexandre Belloni1-1/+1
The atmel,clk-divisors property is taking 4 divisors, if less are provided, the clock registration will fail. Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
2014-09-05ARM: at91: rm9200: fix clock registrationAlexandre Belloni1-1/+10
Actually register clocks from device tree when using the common clock framework. Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com> [nicolas.ferre@atmel.com: add at91 to function name] Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
2014-09-05ARM: at91/dt: sam9g20: set at91sam9g20 pllb driverGaël PORTAY1-0/+1
The at91sam9g20 SOC uses its own pllb implementation which is different from the one inherited from at91sam9260 SOC. Signed-off-by: Gaël PORTAY <gael.portay@gmail.com> Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
2014-09-05mm: memcontrol: revert use of root_mem_cgroup res_counterJohannes Weiner1-25/+78
Dave Hansen reports a massive scalability regression in an uncontained page fault benchmark with more than 30 concurrent threads, which he bisected down to 05b843012335 ("mm: memcontrol: use root_mem_cgroup res_counter") and pin-pointed on res_counter spinlock contention. That change relied on the per-cpu charge caches to mostly swallow the res_counter costs, but it's apparent that the caches don't scale yet. Revert memcg back to bypassing res_counters on the root level in order to restore performance for uncontained workloads. Reported-by: Dave Hansen <dave@sr71.net> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Tested-by: Dave Hansen <dave.hansen@intel.com> Acked-by: Michal Hocko <mhocko@suse.cz> Acked-by: Vladimir Davydov <vdavydov@parallels.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-09-05Export sync_filesystem() for modular ->remount_fs() useAnton Altaparmakov1-1/+1
This patch changes sync_filesystem() to be EXPORT_SYMBOL(). The reason this is needed is that starting with 3.15 kernel, due to Theodore Ts'o's commit 02b9984d6408 ("fs: push sync_filesystem() down to the file system's remount_fs()"), all file systems that have dirty data to be written out need to call sync_filesystem() from their ->remount_fs() method when remounting read-only. As this is now a generically required function rather than an internal only function it should be EXPORT_SYMBOL() so that all file systems can call it. Signed-off-by: Anton Altaparmakov <aia21@cantab.net> Acked-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-09-05Merge tag 'regulator-v3.17-rc3' of ↵Linus Torvalds8-17/+21
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator documentation fixes from Mark Brown: "All the fixes people have found for the regulator API have been documentation fixes, avoiding warnings while building the kerneldoc, fixing some errors in one of the DT bindings documents and fixing some typos in the header" * tag 'regulator-v3.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: fix kernel-doc warnings in header files regulator: Proofread documentation regulator: tps65090: Fix tps65090 typos in example
2014-09-05Merge tag 'omap-fixes-against-v3.17-rc3' of ↵Kevin Hilman317-1293/+2620
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes Merge "omap fixes against v3.17-rc3" from Tony Lindgren: Few fixes for omaps mostly for various devices to get them working properly on the new am437x and dra7 hardware for several devices such as I2C, NAND, DDR3 and USB. There's also a clock fix for omap3. And also included are two minor cosmetic fixes that are not stictly fixes for the new hardware support added recently to downgrade a GPMC warning into a debug statement, and fix the confusing comments for dra7-evm spi1 mux. Note that these are all .dts changes except for a GPMC change. * tag 'omap-fixes-against-v3.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: (255 commits) ARM: dts: dra7-evm: Add vtt regulator support ARM: dts: dra7-evm: Fix spi1 mux documentation ARM: dts: am43x-epos-evm: Disable QSPI to prevent conflict with GPMC-NAND ARM: OMAP2+: gpmc: Don't complain if wait pin is used without r/w monitoring ARM: dts: am43xx-epos-evm: Don't use read/write wait monitoring ARM: dts: am437x-gp-evm: Don't use read/write wait monitoring ARM: dts: am437x-gp-evm: Use BCH16 ECC scheme instead of BCH8 ARM: dts: am43x-epos-evm: Use BCH16 ECC scheme instead of BCH8 ARM: dts: am4372: fix USB regs size ARM: dts: am437x-gp: switch i2c0 to 100KHz ARM: dts: dra7-evm: Fix 8th NAND partition's name ARM: dts: dra7-evm: Fix i2c3 pinmux and frequency Linux 3.17-rc3 ... Signed-off-by: Kevin Hilman <khilman@linaro.org>
2014-09-05Merge tag 'gpio-v3.17-3' of ↵Linus Torvalds3-49/+83
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio Pull GPIO fixes from Linus Walleij: - some documentation sync - resource leak in the bt8xx driver - again fix the way varargs are used to handle the optional flags on the gpiod_* accessors. Now hopefully nailed the entire problem. * tag 'gpio-v3.17-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: gpio: move varargs hack outside #ifdef GPIOLIB gpio: bt8xx: fix release of managed resources Documentation: gpio: documentation for optional getters functions
2014-09-05Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds10-41/+90
Pull drm fixes from Dave Airlie: - i915 fixes: a few display regressions - vmwgfx: possible loop forever fix - nouveau: one userspace interface fix * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: drm/nouveau/core: don't leak oclass type bits to user drm/i915: Fix lock dropping in intel_tv_detect() drm/i915: handle G45/GM45 pulse detection connected state. drm/vmwgfx: Fix a potential infinite spin waiting for fifo idle drm/vmwgfx: Fix an incorrect OOM return value drm/i915: Remove bogus __init annotation from DMI callbacks drm/i915: don't warn if backlight unexpectedly enabled drm/i915: Move intel_ddi_set_vc_payload_alloc(false) to haswell_crtc_disable() drm/i915: fix plane/cursor handling when runtime suspended drm/i915: Ignore VBT backlight presence check on Acer C720 (4005U)
2014-09-05Merge branches 'pm-sleep', 'powercap', 'pm-domains' and 'pm-cpufreq'Rafael J. Wysocki6-22/+26
* pm-sleep: PM / sleep: Fix test_suspend= command line option * powercap: powercap / RAPL: change domain detection message powercap / RAPL: add support for CPU model 0x3f * pm-domains: PM / domains: Make generic_pm_domain.name const * pm-cpufreq: cpufreq: intel_pstate: Remove unneeded variable
2014-09-05Merge branches 'acpi-video' and 'acpi-ec'Rafael J. Wysocki2-2/+47
* acpi-video: ACPI / video: Disable native_backlight on HP ENVY 15 Notebook PC ACPI / video: Add a disable_native_backlight quirk ACPI / video: Fix use_native_backlight selection logic * acpi-ec: ACPI / EC: Add msi quirk for Clevo W350etq
2014-09-05Merge branches 'acpica', 'acpi-processor' and 'acpi-scan'Rafael J. Wysocki3-10/+10
* acpica: ACPICA: ACPI 5.1: Add support for runtime validation of _DSD package. * acpi-processor: ACPI / cpuidle: fix deadlock between cpuidle_lock and cpu_hotplug.lock * acpi-scan: ACPI / scan: not cache _SUN value in struct acpi_device_pnp
2014-09-05sched/wait: Document timeout corner caseScot Doyle1-6/+10
The timeout may elapse without 0 being returned, such as when waiting on an unused queue. Document this possibility. Signed-off-by: Scot Doyle <lkml14@scotdoyle.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/alpine.LNX.2.11.1408241710070.6462@localhost.localdomain Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-09-05Merge remote-tracking branches 'regulator/fix/doc' and ↵Mark Brown8-17/+21
'regulator/fix/tps65090' into regulator-linus
2014-09-05sched/fair: Replace rcu_assign_pointer() with RCU_INIT_POINTER()Andreea-Cristina Bernat1-1/+1
The use of "rcu_assign_pointer()" is NULLing out the pointer. According to RCU_INIT_POINTER()'s block comment: "1. This use of RCU_INIT_POINTER() is NULLing out the pointer" it is better to use it instead of rcu_assign_pointer() because it has a smaller overhead. The following Coccinelle semantic patch was used: @@ @@ - rcu_assign_pointer + RCU_INIT_POINTER (..., NULL) Signed-off-by: Andreea-Cristina Bernat <bernat.ada@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: paulmck@linux.vnet.ibm.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20140822145043.GA580@ada Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-09-05Merge branch 'linux-3.17' of ↵Dave Airlie1-2/+2
git://anongit.freedesktop.org/git/nouveau/linux-2.6 into drm-fixes single fix for nouveau. * 'linux-3.17' of git://anongit.freedesktop.org/git/nouveau/linux-2.6: drm/nouveau/core: don't leak oclass type bits to user
2014-09-05drm/nouveau/core: don't leak oclass type bits to userBen Skeggs1-2/+2
Fixes not being able to init fence subsystem when multiple boards are present. Reported-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2014-09-05Merge git://git.kvack.org/~bcrl/aio-fixesLinus Torvalds1-1/+12
Pull aio bugfixes from Ben LaHaise: "Two small fixes" * git://git.kvack.org/~bcrl/aio-fixes: aio: block exit_aio() until all context requests are completed aio: add missing smp_rmb() in read_events_ring
2014-09-05aio: block exit_aio() until all context requests are completedGu Zheng1-1/+6
It seems that exit_aio() also needs to wait for all iocbs to complete (like io_destroy), but we missed the wait step in current implemention, so fix it in the same way as we did in io_destroy. Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Benjamin LaHaise <bcrl@kvack.org> Cc: stable@vger.kernel.org