summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2016-10-03ceph: remove warning when ceph_releasepage() is called on dirty pageNeilBrown1-3/+2
If O_DIRECT writes are racing with buffered writes, then the call to invalidate_inode_pages2_range() can call ceph_releasepage() on dirty pages. Most filesystems hold inode_lock() across O_DIRECT writes so they do not suffer this race, but cephfs deliberately drops the lock, and opens a window for the race. This race can be triggered with the generic/036 test from the xfstests test suite. It doesn't happen every time, but it does happen often. As the possibilty is expected, remove the warning, and instead include the PageDirty() status in the debug message. Signed-off-by: NeilBrown <neilb@suse.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Yan, Zheng <zyan@redhat.com>
2016-10-03ceph: ignore error from invalidate_inode_pages2_range() in direct writeNeilBrown1-2/+2
This call can fail if there are dirty pages. The preceding call to filemap_write_and_wait_range() will normally remove dirty pages, but as inode_lock() is not held over calls to ceph_direct_read_write(), it could race with non-direct writes and pages could be dirtied immediately after filemap_write_and_wait_range() returns If there are dirty pages, they will be removed by the subsequent call to truncate_inode_pages_range(), so having them here is not a problem. If the 'ret' value is left holding an error, then in the async IO case (aio_req is not NULL) the loop that would normally call ceph_osdc_start_request() will see the error in 'ret' and abort all requests. This doesn't seem like correct behaviour. So use separate 'ret2' instead of overloading 'ret'. Signed-off-by: NeilBrown <neilb@suse.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Yan, Zheng <zyan@redhat.com>
2016-10-03ceph: fix error handling of start_read()Yan, Zheng1-10/+9
If start_page() fails to add a page to page cache or fails to send OSD request. It should cal put_page() (instead of free_page()) for relevant pages. Besides, start_page() need to cancel fscache readpage if it fails to send OSD request. Signed-off-by: Yan, Zheng <zyan@redhat.com> Reported-by: Zhi Zhang <zhang.david2011@gmail.com>
2016-10-03rbd: add rbd_obj_request_error() helperIlya Dryomov1-10/+18
Pull setting an error and marking a request done code into a new helper. obj_request_img_data_test() check isn't strictly needed right now, but makes it applicable to !img_data requests and a bit safer. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-10-03rbd: img_data requests don't own their page arrayIlya Dryomov1-8/+3
Move the check into rbd_obj_request_destroy() to avoid use-after-free on errors in rbd_img_request_fill(..., OBJ_REQUEST_PAGES, ...), where pages, owned by the caller, gets freed in rbd_img_request_fill(). Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Alex Elder <elder@linaro.org> Reviewed-by: David Disseldorp <ddiss@suse.de>
2016-10-03rbd: don't call rbd_osd_req_format_read() for !img_data requestsIlya Dryomov1-7/+2
Accessing obj_request->img_request union field is only valid for object requests associated with an image (i.e. if obj_request_img_data_test() returns true). rbd_osd_req_format_read() used to do more, but now it just sets osd_req->snap_id. Standalone and stat object requests always go to the HEAD revision and are fine with CEPH_NOSNAP set by libceph, so get around the invalid union field use by simply not calling rbd_osd_req_format_read() in those places. Reported-by: David Disseldorp <ddiss@suse.de> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Alex Elder <elder@linaro.org> Reviewed-by: David Disseldorp <ddiss@suse.de>
2016-10-03rbd: rework rbd_img_obj_exists_submit() error pathsIlya Dryomov1-20/+22
- don't put obj_request before rbd_obj_request_get() if rbd_obj_request_create() fails - don't leak pages if rbd_obj_request_create() fails - don't leak stat_request if rbd_osd_req_create() fails Reported-by: David Disseldorp <ddiss@suse.de> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Alex Elder <elder@linaro.org> Reviewed-by: David Disseldorp <ddiss@suse.de>
2016-10-03rbd: don't crash or leak on errors in rbd_img_obj_parent_read_full_callback()Ilya Dryomov1-1/+2
- fix parent_length == img_request->xferred assert to not fire on copyup read failures - don't leak pages if copyup read fails or we can't allocate a new osd request Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Alex Elder <elder@linaro.org> Reviewed-by: David Disseldorp <ddiss@suse.de>
2016-10-03rbd: move bumping img_request refcount into rbd_obj_request_submit()Ilya Dryomov1-4/+8
Commit 0f2d5be792b0 ("rbd: use reference counts for image requests") added rbd_img_request_get(), which rbd_img_request_fill() calls for each obj_request added to img_request. It was an urgent band-aid for the uglyness that is rbd_img_obj_callback() and none of the error paths were updated. Given that this img_request reference is meant to represent an obj_request that hasn't passed through rbd_img_obj_callback() yet, proper cleanup in appropriate destructors is a challenge. However, noting that if we don't get a chance to call rbd_obj_request_complete(), there is not going to be a call to rbd_img_obj_callback(), we can move rbd_img_request_get() into rbd_obj_request_submit() and fixup the two places that call rbd_obj_request_complete() directly and not through rbd_obj_request_submit() to temporarily bump img_request, so that rbd_img_obj_callback() can put as usual. This takes care of img_request leaks on errors on the submit side. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Alex Elder <elder@linaro.org>
2016-10-03rbd: mark the original request as done if stat request failsIlya Dryomov1-13/+15
If stat request fails with something other than -ENOENT (which just means that we need to copyup), the original object request is never marked as done and therefore never completed. Fix this by moving the mark done + complete snippet from rbd_img_obj_parent_read_full() into rbd_img_obj_exists_callback(). The former remains covered, as the latter is its only caller (through rbd_img_obj_request_submit()). Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Alex Elder <elder@linaro.org> Reviewed-by: David Disseldorp <ddiss@suse.de>
2016-10-03rbd: clean up asserts in rbd_img_obj_request_submit() helpersIlya Dryomov1-20/+10
Assert once in rbd_img_obj_request_submit(). Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Alex Elder <elder@linaro.org> Reviewed-by: David Disseldorp <ddiss@suse.de>
2016-10-03rbd: change rbd_obj_request_submit() signatureIlya Dryomov1-47/+23
- osdc parameter is useless - starting with commit 5aea3dcd5021 ("libceph: a major OSD client update"), ceph_osdc_start_request() always returns success Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Alex Elder <elder@linaro.org> Reviewed-by: David Disseldorp <ddiss@suse.de>
2016-10-03rbd: lock_on_read map optionIlya Dryomov1-1/+12
Add a per-device option to acquire exclusive lock on reads (in addition to writes and discards). The use case is iSCSI, where it will be used to prevent execution of stale writes after the implicit failover. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Tested-by: Mike Christie <mchristi@redhat.com>
2016-08-25rbd: add force close optionMike Christie2-12/+33
This adds a force close option, so we can force the unmapping of a rbd device that is open. If a path/device is blacklisted, apps like multipathd can map a new device and then unmap the old one. The unmapping cleanup would then be handled by the generic hotunplug code paths in multipahd like is done for iSCSI, FC/FCOE, SAS, etc. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-08-25rbd: add 'config_info' sysfs rbd device attributeMike Christie2-2/+26
Export the info used to setup the rbd image, so it can be used to remap the image. Signed-off-by: Mike Christie <mchristi@redhat.com> [idryomov@gmail.com: do_rbd_add() EH] Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-08-25rbd: add 'snap_id' sysfs rbd device attributeMike Christie2-0/+14
Export snap id in sysfs, so tools like multipathd can use it in a uuid. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-08-25rbd: add 'cluster_fsid' sysfs rbd device attributeMike Christie2-0/+14
Export the cluster fsid, so tools like udev and multipath-tools can use it for part of the uuid. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-08-25rbd: add 'client_addr' sysfs rbd device attributeIlya Dryomov4-0/+26
Export client addr/nonce, so userspace can check if a image is being blacklisted. Signed-off-by: Mike Christie <mchristi@redhat.com> [idryomov@gmail.com: ceph_client_addr(), endianess fix] Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-08-25rbd: print capacity in decimal and features in hexIlya Dryomov1-2/+3
With exclusive-lock added and more to come, print features into dmesg. Change capacity to decimal while at it. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Mike Christie <mchristi@redhat.com>
2016-08-25rbd: support for exclusive-lock featureIlya Dryomov3-16/+808
Add basic support for RBD_FEATURE_EXCLUSIVE_LOCK feature. Maintenance operations (resize, snapshot create, etc) are offloaded to librbd via returning -EOPNOTSUPP - librbd should request the lock and execute the operation. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Mike Christie <mchristi@redhat.com> Tested-by: Mike Christie <mchristi@redhat.com>
2016-08-25rbd: retry watch re-registration periodicallyIlya Dryomov2-29/+110
Revamp watch code to support retrying watch re-registration: - add rbd_dev->watch_state for more robust errcb handling - store watch cookie separately to avoid dereferencing watch_handle which is set to NULL on unwatch - move re-register code into a delayed work and retry re-registration every second, unless the client is blacklisted Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Mike Christie <mchristi@redhat.com> Tested-by: Mike Christie <mchristi@redhat.com>
2016-08-25rbd: introduce a per-device ordered workqueueIlya Dryomov1-80/+71
This is going to be used for re-registering watch requests and exclusive-lock tasks: acquire/request lock, notify-acquired, release lock, notify-released. Some refactoring in the map/unmap paths was necessary to give this workqueue a meaningful name: "rbdX-tasks". Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Mike Christie <mchristi@redhat.com>
2016-08-25libceph: rename ceph_client_id() -> ceph_client_gid()Ilya Dryomov3-5/+6
It's gid / global_id in other places. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Mike Christie <mchristi@redhat.com> Reviewed-by: Alex Elder <elder@linaro.org>
2016-08-25libceph: support for blacklisting clientsDouglas Fuller3-0/+96
Reuse ceph_mon_generic_request infrastructure for sending monitor commands. In particular, add support for 'blacklist add' to prevent other, non-responsive clients from making further updates. Signed-off-by: Douglas Fuller <dfuller@redhat.com> [idryomov@gmail.com: refactor, misc fixes throughout] Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Mike Christie <mchristi@redhat.com> Reviewed-by: Alex Elder <elder@linaro.org>
2016-08-25libceph: support for lock.lock_infoDouglas Fuller2-0/+167
Add an interface for the Ceph OSD lock.lock_info method and associated data structures. Based heavily on code by Mike Christie <michaelc@cs.wisc.edu>. Signed-off-by: Douglas Fuller <dfuller@redhat.com> [idryomov@gmail.com: refactor, misc fixes throughout] Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Mike Christie <mchristi@redhat.com> Reviewed-by: Alex Elder <elder@linaro.org>
2016-08-25libceph: support for advisory locking on RADOS objectsDouglas Fuller3-0/+208
This patch adds support for rados lock, unlock and break lock. Based heavily on code by Mike Christie <michaelc@cs.wisc.edu>. Signed-off-by: Douglas Fuller <dfuller@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Mike Christie <mchristi@redhat.com> Reviewed-by: Alex Elder <elder@linaro.org>
2016-08-25libceph: add ceph_osdc_call() single-page helperDouglas Fuller2-0/+59
Add a convenience function to osd_client to send Ceph OSD 'class' ops. The interface assumes that the request and reply data each consist of single pages. Signed-off-by: Douglas Fuller <dfuller@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Mike Christie <mchristi@redhat.com> Reviewed-by: Alex Elder <elder@linaro.org>
2016-08-25libceph: support for CEPH_OSD_OP_LIST_WATCHERSDouglas Fuller2-1/+131
Add support for this Ceph OSD op, needed to support the RBD exclusive lock feature. Signed-off-by: Douglas Fuller <dfuller@redhat.com> [idryomov@gmail.com: refactor, misc fixes throughout] Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Mike Christie <mchristi@redhat.com> Reviewed-by: Alex Elder <elder@linaro.org>
2016-08-25libceph: rename ceph_entity_name_encode() -> ceph_auth_entity_name_encode()Ilya Dryomov3-4/+7
Clear up EntityName vs entity_name_t confusion. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Mike Christie <mchristi@redhat.com> Reviewed-by: Alex Elder <elder@linaro.org>
2016-08-15Linux 4.8-rc2Linus Torvalds1-1/+1
2016-08-15Merge branch 'next' of ↵Linus Torvalds9-8/+84
git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux Pull thermal updates from Zhang Rui: - Fix a race condition when updating cooling device, which may lead to a situation where a thermal governor never updates the cooling device. From Michele Di Giorgio. - Fix a zero division error when disabling the forced idle injection from the intel powerclamp. From Petr Mladek. - Add suspend/resume callback for intel_pch_thermal thermal driver. From Srinivas Pandruvada. - Another two fixes for clocking cooling driver and hwmon sysfs I/F. From Michele Di Giorgio and Kuninori Morimoto. [ Hmm. That suspend/resume callback for intel_pch_thermal doesn't look like a fix, but I'm letting it slide.. - Linus ] * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux: thermal: clock_cooling: Fix missing mutex_init() thermal: hwmon: EXPORT_SYMBOL_GPL for thermal hwmon sysfs thermal: fix race condition when updating cooling device thermal/powerclamp: Prevent division by zero when counting interval thermal: intel_pch_thermal: Add suspend/resume callback
2016-08-15Merge branch 'for-linus' of ↵Linus Torvalds1-1/+0
git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu Pull m68knommu fix from Greg Ungerer: "This contains only a single fix for a register corruption problem on certain types of m68k flat format binaries" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu: m68knommu: fix user a5 register being overwritten
2016-08-14Merge tag 'fixes-for-linus-4.8' of ↵Linus Torvalds2-1/+3
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull h8300 and unicore32 architecture fixes from Guenter Roeck: "Two patches to fix h8300 and unicore32 builds. unicore32 builds have been broken since v4.6. The fix has been available in -next since March of this year. h8300 builds have been broken since the last commit window. The fix has been available in -next since June of this year" * tag 'fixes-for-linus-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: h8300: Add missing include file to asm/io.h unicore32: mm: Add missing parameter to arch_vma_access_permitted
2016-08-14Merge tag 'arm64-fixes' of ↵Linus Torvalds8-86/+136
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Catalin Marinas: - support for nr_cpus= command line argument (maxcpus was previously changed to allow secondary CPUs to be hot-plugged) - ARM PMU interrupt handling fix - fix potential TLB conflict in the hibernate code - improved handling of EL1 instruction aborts (better error reporting) - removal of useless jprobes code for stack saving/restoring - defconfig updates * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: defconfig: enable CONFIG_LOCALVERSION_AUTO arm64: defconfig: add options for virtualization and containers arm64: hibernate: handle allocation failures arm64: hibernate: avoid potential TLB conflict arm64: Handle el1 synchronous instruction aborts cleanly arm64: Remove stack duplicating code from jprobes drivers/perf: arm-pmu: Fix handling of SPI lacking "interrupt-affinity" property drivers/perf: arm-pmu: convert arm_pmu_mutex to spinlock arm64: Support hard limit of cpu count by nr_cpus
2016-08-13Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds8-53/+118
Pull KVM fixes from Radim Krčmář: "KVM: - lock kvm_device list to prevent corruption on device creation. PPC: - split debugfs initialization from creation of the xics device to unlock the newly taken kvm lock earlier. s390: - prevent userspace from triggering two WARN_ON_ONCE. MIPS: - fix several issues in the management of TLB faults (Cc: stable)" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: MIPS: KVM: Propagate kseg0/mapped tlb fault errors MIPS: KVM: Fix gfn range check in kseg0 tlb faults MIPS: KVM: Add missing gfn range check MIPS: KVM: Fix mapped fault broken commpage handling KVM: Protect device ops->create and list_add with kvm->lock KVM: PPC: Move xics_debugfs_init out of create KVM: s390: reset KVM_REQ_MMU_RELOAD if mapping the prefix failed KVM: s390: set the prefix initially properly
2016-08-13Merge branch 'for-linus' of git://git.kernel.dk/linux-blockLinus Torvalds10-85/+160
Pull block fixes from Jens Axboe: - an NVMe fix from Gabriel, fixing a suspend/resume issue on some setups - addition of a few missing entries in the block queue sysfs documentation, from Joe - a fix for a sparse shadow warning for the bvec iterator, from Johannes - a writeback deadlock involving raid issuing barriers, and not flushing the plug when we wakeup the flusher threads. From Konstantin - a set of patches for the NVMe target/loop/rdma code, from Roland and Sagi * 'for-linus' of git://git.kernel.dk/linux-block: bvec: avoid variable shadowing warning doc: update block/queue-sysfs.txt entries nvme: Suspend all queues before deletion mm, writeback: flush plugged IO in wakeup_flusher_threads() nvme-rdma: Remove unused includes nvme-rdma: start async event handler after reconnecting to a controller nvmet: Fix controller serial number inconsistency nvmet-rdma: Don't use the inline buffer in order to avoid allocation for small reads nvmet-rdma: Correctly handle RDMA device hot removal nvme-rdma: Make sure to shutdown the controller if we can nvme-loop: Remove duplicate call to nvme_remove_namespaces nvme-rdma: Free the I/O tags when we delete the controller nvme-rdma: Remove duplicate call to nvme_remove_namespaces nvme-rdma: Fix device removal handling nvme-rdma: Queue ns scanning after a sucessful reconnection nvme-rdma: Don't leak uninitialized memory in connect request private data
2016-08-13h8300: Add missing include file to asm/io.hGuenter Roeck1-0/+2
h8300 builds fail with arch/h8300/include/asm/io.h:9:15: error: unknown type name ‘u8’ arch/h8300/include/asm/io.h:15:15: error: unknown type name ‘u16’ arch/h8300/include/asm/io.h:21:15: error: unknown type name ‘u32’ and many related errors. Fixes: 23c82d41bdf4 ("kexec-allow-architectures-to-override-boot-mapping-fix") Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2016-08-13unicore32: mm: Add missing parameter to arch_vma_access_permittedGuenter Roeck1-1/+1
unicore32 fails to compile with the following errors. mm/memory.c: In function ‘__handle_mm_fault’: mm/memory.c:3381: error: too many arguments to function ‘arch_vma_access_permitted’ mm/gup.c: In function ‘check_vma_flags’: mm/gup.c:456: error: too many arguments to function ‘arch_vma_access_permitted’ mm/gup.c: In function ‘vma_permits_fault’: mm/gup.c:640: error: too many arguments to function ‘arch_vma_access_permitted’ Fixes: d61172b4b695b ("mm/core, x86/mm/pkeys: Differentiate instruction fetches") Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Acked-by: Guan Xuetao <gxt@mprc.pku.edu.cn>
2016-08-13Merge tag 'vfio-v4.8-rc2' of git://github.com/awilliam/linux-vfioLinus Torvalds1-36/+49
Pull VFIO fix from Alex Williamson: "Fix oops when dereferencing empty data (Alex Williamson)" * tag 'vfio-v4.8-rc2' of git://github.com/awilliam/linux-vfio: vfio/pci: Fix NULL pointer oops in error interrupt setup handling
2016-08-13Merge tag 'nfsd-4.8-1' of git://linux-nfs.org/~bfields/linuxLinus Torvalds2-20/+54
Pull nfsd fixes from Bruce Fields: "Fixes for the dentry refcounting leak I introduced in 4.8-rc1, and for races in the LOCK code which appear to go back to the big nfsd state lock removal from 3.17" * tag 'nfsd-4.8-1' of git://linux-nfs.org/~bfields/linux: nfsd: don't return an unhashed lock stateid after taking mutex nfsd: Fix race between FREE_STATEID and LOCK nfsd: fix dentry refcounting on create
2016-08-13Merge tag 'pm-4.8-rc2' of ↵Linus Torvalds5-14/+36
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "Two hibernation fixes allowing it to work with the recently added randomization of the kernel identity mapping base on x86-64 and one cpufreq driver regression fix. Specifics: - Fix the x86 identity mapping creation helpers to avoid the assumption that the base address of the mapping will always be aligned at the PGD level, as it may be aligned at the PUD level if address space randomization is enabled (Rafael Wysocki). - Fix the hibernation core to avoid executing tracing functions before restoring the processor state completely during resume (Thomas Garnier). - Fix a recently introduced regression in the powernv cpufreq driver that causes it to crash due to an out-of-bounds array access (Akshay Adiga)" * tag 'pm-4.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM / hibernate: Restore processor state before using per-CPU variables x86/power/64: Always create temporary identity mapping correctly cpufreq: powernv: Fix crash in gpstate_timer_handler()
2016-08-13Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds20-195/+182
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "This is bigger than usual - the reason is partly a pent-up stream of fixes after the merge window and partly accidental. The fixes are: - five patches to fix a boot failure on Andy Lutomirsky's laptop - four SGI UV platform fixes - KASAN fix - warning fix - documentation update - swap entry definition fix - pkeys fix - irq stats fix" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/apic/x2apic, smp/hotplug: Don't use before alloc in x2apic_cluster_probe() x86/efi: Allocate a trampoline if needed in efi_free_boot_services() x86/boot: Rework reserve_real_mode() to allow multiple tries x86/boot: Defer setup_real_mode() to early_initcall time x86/boot: Synchronize trampoline_cr4_features and mmu_cr4_features directly x86/boot: Run reserve_bios_regions() after we initialize the memory map x86/irq: Do not substract irq_tlb_count from irq_call_count x86/mm: Fix swap entry comment and macro x86/mm/kaslr: Fix -Wformat-security warning x86/mm/pkeys: Fix compact mode by removing protection keys' XSAVE buffer manipulation x86/build: Reduce the W=1 warnings noise when compiling x86 syscall tables x86/platform/UV: Fix kernel panic running RHEL kdump kernel on UV systems x86/platform/UV: Fix problem with UV4 BIOS providing incorrect PXM values x86/platform/UV: Fix bug with iounmap() of the UV4 EFI System Table causing a crash x86/platform/UV: Fix problem with UV4 Socket IDs not being contiguous x86/entry: Clarify the RF saving/restoring situation with SYSCALL/SYSRET x86/mm: Disable preemption during CR3 read+write x86/mm/KASLR: Increase BRK pages for KASLR memory randomization x86/mm/KASLR: Fix physical memory calculation on KASLR memory randomization x86, kasan, ftrace: Put APIC interrupt handlers into .irqentry.text
2016-08-12Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds6-7/+60
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Ingo Molnar: "Misc fixes: a /dev/rtc regression fix, two APIC timer period calibration fixes, an ARM clocksource driver fix and a NOHZ power use regression fix" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/hpet: Fix /dev/rtc breakage caused by RTC cleanup x86/timers/apic: Inform TSC deadline clockevent device about recalibration x86/timers/apic: Fix imprecise timer interrupts by eliminating TSC clockevents frequency roundoff error timers: Fix get_next_timer_interrupt() computation clocksource/arm_arch_timer: Force per-CPU interrupt to be level-triggered
2016-08-12Merge branches 'pm-sleep' and 'pm-cpufreq'Rafael J. Wysocki5-14/+36
* pm-sleep: PM / hibernate: Restore processor state before using per-CPU variables x86/power/64: Always create temporary identity mapping correctly * pm-cpufreq: cpufreq: powernv: Fix crash in gpstate_timer_handler()
2016-08-12Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds5-4/+34
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "Misc fixes: cputime fixes, two deadline scheduler fixes and a cgroups scheduling fix" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/cputime: Fix omitted ticks passed in parameter sched/cputime: Fix steal time accounting sched/deadline: Fix lock pinning warning during CPU hotplug sched/cputime: Mitigate performance regression in times()/clock_gettime() sched/fair: Fix typo in sync_throttle() sched/deadline: Fix wrap-around in DL heap
2016-08-12PM / hibernate: Restore processor state before using per-CPU variablesThomas Garnier1-2/+2
Restore the processor state before calling any other functions to ensure per-CPU variables can be used with KASLR memory randomization. Tracing functions use per-CPU variables (GS based on x86) and one was called just before restoring the processor state fully. It resulted in a double fault when both the tracing & the exception handler functions tried to use a per-CPU variable. Fixes: bb3632c6101b (PM / sleep: trace events for suspend/resume) Reported-and-tested-by: Borislav Petkov <bp@suse.de> Reported-by: Jiri Kosina <jikos@kernel.org> Tested-by: Rafael J. Wysocki <rafael@kernel.org> Tested-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Thomas Garnier <thgarnie@google.com> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-08-12Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds19-93/+298
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Mostly tooling fixes, plus two uncore-PMU fixes, an uprobes fix, a perf-cgroups fix and an AUX events fix" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel/uncore: Add enable_box for client MSR uncore perf/x86/intel/uncore: Fix uncore num_counters uprobes/x86: Fix RIP-relative handling of EVEX-encoded instructions perf/core: Set cgroup in CPU contexts for new cgroup events perf/core: Fix sideband list-iteration vs. event ordering NULL pointer deference crash perf probe ppc64le: Fix probe location when using DWARF perf probe: Add function to post process kernel trace events tools: Sync cpufeatures headers with the kernel toops: Sync tools/include/uapi/linux/bpf.h with the kernel tools: Sync cpufeatures.h and vmx.h with the kernel perf probe: Support signedness casting perf stat: Avoid skew when reading events perf probe: Fix module name matching perf probe: Adjust map->reloc offset when finding kernel symbol from map perf hists: Trim libtraceevent trace_seq buffers perf script: Add 'bpf-output' field to usage message
2016-08-12nfsd: don't return an unhashed lock stateid after taking mutexJeff Layton1-5/+20
nfsd4_lock will take the st_mutex before working with the stateid it gets, but between the time when we drop the cl_lock and take the mutex, the stateid could become unhashed (a'la FREE_STATEID). If that happens the lock stateid returned to the client will be forgotten. Fix this by first moving the st_mutex acquisition into lookup_or_create_lock_state. Then, have it check to see if the lock stateid is still hashed after taking the mutex. If it's not, then put the stateid and try the find/create again. Signed-off-by: Jeff Layton <jlayton@redhat.com> Tested-by: Alexey Kodanev <alexey.kodanev@oracle.com> Cc: stable@vger.kernel.org # feb9dad5 nfsd: Always lock state exclusively. Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-08-12Merge branch 'locking-urgent-for-linus' of ↵Linus Torvalds4-5/+48
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Ingo Molnar: "Misc fixes: lockstat fix, futex fix on !MMU systems, big endian fix for qrwlocks and a race fix for pvqspinlocks" * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/pvqspinlock: Fix a bug in qstat_read() locking/pvqspinlock: Fix double hash race locking/qrwlock: Fix write unlock bug on big endian systems futex: Assume all mappings are private on !MMU systems
2016-08-12Merge branch 'irq-urgent-for-linus' of ↵Linus Torvalds3-0/+15
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fix from Ingo Molnar: "A fix for an MSI regression" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq/msi: Make sure PCI MSIs are activated early