summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2018-05-28btrfs: cleanup __btrfs_open_devices() drop head pointerAnand Jain1-2/+1
__btrfs_open_devices() declares struct list_head *head, however head is used only once, instead use btrfs_fs_devices::devices directly. Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: rename struct btrfs_fs_devices::listAnand Jain3-10/+10
btrfs_fs_devices::list is the list of BTRFS fsid in the kernel, a generic name 'list' makes it's search very difficult, rename it to fs_list. Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: Drop fs_info parameter from btrfs_merge_delayed_refsNikolay Borisov3-4/+2
It's provided by the transaction handle. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: Drop fs_info parameter from add_delayed_data_refNikolay Borisov1-7/+5
It's provided by the transaction handle. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: Drop add_delayed_ref_head fs_info parameterNikolay Borisov1-11/+10
It's provided by the transaction handle. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: Remove btrfs_wait_and_free_delalloc_workNikolay Borisov2-8/+2
This function is called from only 1 place and is effectively a wrapper over wait_completion/kfree. It doesn't really bring any value having those two calls in a separate function. Just open code it and remove it. No functional changes. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: Remove tree argument from extent_writepagesNikolay Borisov3-9/+4
It can be directly referenced from the passed address_space so do that. No functional changes. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: Use list_empty instead of list_empty_carefulNikolay Borisov1-2/+2
list_empty_careful usually is a signal of something tricky going on. Its usage in btrfs is actually not needed since both lists it's used on are local to a function and cannot be modified concurrently. So switch to plain list_empty. No functional changes. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: Remove redundant tree argument from extent_readpagesNikolay Borisov3-9/+7
This function is called only from btrfs_readpage and is already passed the mapping. Simplify its signature by moving the code obtaining reference to the extent tree in the function. No functional changes. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: Remove map argument from try_release_extent_stateNikolay Borisov1-3/+2
It's not used in the function so just remove it. No functional changes. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: Sink extent_tree arguments in try_release_extent_mappingNikolay Borisov3-13/+5
This function already gets the page from which the two extent trees are referenced. Simplify its signature by moving the code getting the trees inside the function. No functional changes. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: Allow rmdir(2) to delete an empty subvolumeMisono Tomohiro1-1/+1
Change the behavior of rmdir(2) and allow it to delete an empty subvolume by using btrfs_delete_subvolume() which is used by btrfs_ioctl_snap_destroy(). This is a change in behaviour and has been requested by users. Deleting the subvolume by ioctl requires root permissions while the rmdir way does works with standard tools and syscalls for all users that can access the subvolume. The main usecase is to allow 'rm -rf /path/with/subvols' to simply work. We were not able to find any nasty usability surprises, the intention is to do the destructive rm. Without allowing rmdir, this would have to be followed by the ioctl subvolume deletion, which is more of an annoyance. Implementation details: The required lock for @dir and inode of @dentry is already acquired in vfs layer. We need some check before deleting a subvolume. Permission check is done in vfs layer, emptiness check is in btrfs_rmdir() and additional check (i.e. neither the subvolume is a default subvolume nor send is in progress) is in btrfs_delete_subvolume(). Note that in btrfs_ioctl_snap_destroy(), d_delete() is called after btrfs_delete_subvolume(). For rmdir(2), d_delete() is called in vfs layer later. Tested-by: Goffredo Baroncelli <kreijack@inwind.it> Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com> Reviewed-by: David Sterba <dsterba@suse.com> [ enhance changelog ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: Factor out the main deletion process from btrfs_ioctl_snap_destroy()Misono Tomohiro3-136/+144
Factor out the second half of btrfs_ioctl_snap_destroy() as btrfs_delete_subvolume(), which performs some subvolume specific checks before deletion: 1. send is not in progress 2. the subvolume is not the default subvolume 3. the subvolume does not contain other subvolumes and actual deletion process. btrfs_delete_subvolume() requires inode_lock for both @dir and inode of @dentry. The remaining part of btrfs_ioctl_snap_destroy() is mainly permission checks. Note that call of d_delete() is not included in btrfs_delete_subvolume() as this function will also be used by btrfs_rmdir() to delete an empty subvolume and in that case d_delete() is called in VFS layer. As a result, btrfs_unlink_subvol() and may_destroy_subvol() become static functions. No functional changes. Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com> Reviewed-by: David Sterba <dsterba@suse.com> [ minor comment updates ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: Move may_destroy_subvol() from ioctl.c to inode.cMisono Tomohiro3-54/+56
This is a preparation work to refactor btrfs_ioctl_snap_destroy() and to allow rmdir(2) to delete an empty subvolume. Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com> Reviewed-by: David Sterba <dsterba@suse.com> [ minor update of the function comment ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: remove unused le_test_bit()Howard McLauchlan1-5/+0
With commit b18253ec57c0 ("btrfs: optimize free space tree bitmap conversion"), there are no more callers to le_test_bit(). This patch removes le_test_bit(). Signed-off-by: Howard McLauchlan <hmclauchlan@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: optimize free space tree bitmap conversionHoward McLauchlan1-38/+23
Presently, convert_free_space_to_extents() does a linear scan of the bitmap. We can speed this up with find_next_{bit,zero_bit}_le(). This patch replaces the linear scan with find_next_{bit,zero_bit}_le(). Testing shows a 20-33% decrease in execution time for convert_free_space_to_extents(). Since we change bitmap to be unsigned long, we have to do some casting for the bitmap cursor. In le_bitmap_set() it makes sense to use u8, as we are doing bit operations. Everywhere else, we're just using it for pointer arithmetic and not directly accessing it, so char seems more appropriate. Suggested-by: Omar Sandoval <osandov@osandov.com> Signed-off-by: Howard McLauchlan <hmclauchlan@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: clean up le_bitmap_{set, clear}()Howard McLauchlan3-43/+20
le_bitmap_set() is only used by free-space-tree, so move it there and make it static. le_bitmap_clear() is not used, so remove it. Signed-off-by: Howard McLauchlan <hmclauchlan@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: use fs_info for btrfs_handle_em_exist tracepointDavid Sterba5-13/+18
We really want to know to which filesystem the extent map events belong, but as it cannot be reached from the extent_map pointers, we need to pass it down the callchain. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: tests: pass fs_info to extent_map testsDavid Sterba1-16/+36
Preparatory work to pass fs_info to btrfs_add_extent_mapping so we can get a better tracepoint message. Extent maps do not need fs_info for anything so we only add a dummy one without any other initialization. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: tracepoints, use extended format with UUID where possibleDavid Sterba1-18/+12
Most of the strings are prefixed by the UUID of the filesystem that generates the message, however there are a few events that still opencode the macro magic and can be converted to the common macros. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: tracepoints, fix whitespace in stringsDavid Sterba1-2/+2
The preferred style is to avoid spaces between key and value and no commas between key=values. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: tracepoints, drop unnecessary ULL castsDavid Sterba1-62/+62
The (unsigned long long) casts are not necessary since long ago. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: tracepoints, use %llu instead of %LuDavid Sterba1-9/+9
For consistency, use the %llu form. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: tracepoints, use correct type for inode numberDavid Sterba1-23/+24
The size of ino_t depends on 32/64bit architecture type. Btrfs stores the full 64bit inode anyway so we should use it. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: Consolidate error checking for btrfs_alloc_chunkNikolay Borisov1-5/+7
The second if is really a subcase of ret being less than 0. So introduce a generic if (ret < 0) check, and inside have another if which explicitly handles the -ENOSPC and any other errors. No functional changes. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: Fix lock release orderNikolay Borisov1-1/+1
Locks should generally be released in the oppposite order they are acquired. Generally lock acquisiton ordering is used to ensure deadlocks don't happen. However, as becomes more complicated it's best to also maintain proper unlock order so as to avoid possible dead locks. This was found by code inspection and doesn't necessarily lead to a deadlock scenario. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: Use while loop instead of labels in __endio_write_update_orderedNikolay Borisov1-27/+25
Currently __endio_write_update_ordered uses labels to implement what is essentially a simple while loop. This makes the code more cumbersome to follow than it actually has to be. No functional changes. No xfstest regressions were found during testing. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: add comment about BTRFS_FS_EXCL_OPAnand Jain1-0/+35
Adds comments about BTRFS_FS_EXCL_OP to existing comments about the device locks. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> [ minor updates ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: Drop delayed_refs argument from btrfs_check_delayed_seqNikolay Borisov3-10/+5
It's used to print its pointer in a debug statement but doesn't really bring any useful information to the error message. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: rename btrfs_get_block_group_info and make it staticSu Yue2-6/+4
The function btrfs_get_block_group_info() was introduced by the commit 5af3e8cce8b7 ("Btrfs: make filesystem read-only when submitting barrier fails") which used it in disk-io.c. However, the function is only called in ioctl.c now. Its parameter type btrfs_ioctl_space_info* is only for ioctl. So, make it static and rename it to be original name get_block_group_info. No functional change. Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: Replace owner argument in add_pinned_bytes with a booleanNikolay Borisov1-7/+13
add_pinned_bytes really cares whether the bytes being pinned are either data or metadata. To that effect it checks whether the 'owner' argument is less than BTRFS_FIRST_FREE_OBJECTID (256). This works because owner can really have 2 types of values: a) For metadata extents it holds the level at which the parent is in the btree. This amounts to owner having the values 0-7 b) In case of modifying data extents, owner is the inode number to which those extents belongs. Let's make this more explicit byt converting the owner parameter to a boolean value and either pass it directly when we know the type of extents we are working with (i.e. in btrfs_free_tree_block). In cases when the parent function can be called on both metadata/data extents perform the check in the caller. This hopefully makes the interface of add_pinned_bytes more intuitive. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-27Linux 4.17-rc7Linus Torvalds1-1/+1
2018-05-27Merge tag 'kbuild-fixes-v4.17-2' of ↵Linus Torvalds1-3/+6
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull more Kbuild fixes from Masahiro Yamada: - enable '-fno-tree-loop-im' only when supported - add '-fno-PIE' option before the asm-goto test * tag 'kbuild-fixes-v4.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: Makefile: disable PIE before testing asm goto kbuild: gcov: enable -fno-tree-loop-im if supported
2018-05-27Merge tag 'armsoc-fixes' of ↵Linus Torvalds15-20/+20
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Olof Johansson: "A few more fixes for v4.17: - a fix for a crash in scm_call_atomic on qcom platforms - display fix for Allwinner A10 - a fix that re-enables ethernet on Allwinner H3 (C.H.I.P et al) - a fix for eMMC corruption on hikey - i2c-gpio descriptor tables for ixp4xx ... plus a small typo fix" * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: ARM: Fix i2c-gpio GPIO descriptor tables arm64: dts: hikey: Fix eMMC corruption regression firmware: qcom: scm: Fix crash in qcom_scm_call_atomic1() ARM: sun8i: v3s: fix spelling mistake: "disbaled" -> "disabled" ARM: dts: sun4i: Fix incorrect clocks for displays ARM: dts: sun8i: h3: Re-enable EMAC on Orange Pi One
2018-05-26Merge branch 'x86-pti-for-linus' of ↵Linus Torvalds2-17/+9
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 store buffer fixes from Thomas Gleixner: "Two fixes for the SSBD mitigation code: - expose SSBD properly to guests. This got broken when the CPU feature flags got reshuffled. - simplify the CPU detection logic to avoid duplicate entries in the tables" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/speculation: Simplify the CPU bug detection logic KVM/VMX: Expose SSBD properly to guests
2018-05-26Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds3-6/+6
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Thomas Gleixner: "Three fixes for scheduler and kthread code: - allow calling kthread_park() on an already parked thread - restore the sched_pi_setprio() tracepoint behaviour - clarify the unclear string for the scheduling domain debug output" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched, tracing: Fix trace_sched_pi_setprio() for deboosting kthread: Allow kthread_park() on a parked kthread sched/topology: Clarify root domain(s) debug string
2018-05-26Merge tag 'hisi-fixes-for-4.17v2' of git://github.com/hisilicon/linux-hisi ↵Olof Johansson1-1/+0
into fixes ARM64: hisi fixes for 4.17 - Remove eMMC max-frequency property to fix eMMC corruption on hikey board * tag 'hisi-fixes-for-4.17v2' of git://github.com/hisilicon/linux-hisi: arm64: dts: hikey: Fix eMMC corruption regression Signed-off-by: Olof Johansson <olof@lixom.net>
2018-05-26ARM: Fix i2c-gpio GPIO descriptor tablesLinus Walleij10-11/+11
I used bad names in my clumsiness when rewriting many board files to use GPIO descriptors instead of platform data. A few had the platform_device ID set to -1 which would indeed give the device name "i2c-gpio". But several had it set to >=0 which gives the names "i2c-gpio.0", "i2c-gpio.1" ... Fix the offending instances in the ARM tree. Sorry for the mess. Fixes: b2e63555592f ("i2c: gpio: Convert to use descriptors") Cc: Wolfram Sang <wsa@the-dreams.de> Cc: Simon Guinot <simon.guinot@sequanux.org> Reported-by: Simon Guinot <simon.guinot@sequanux.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Olof Johansson <olof@lixom.net>
2018-05-26Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds11-75/+198
Pull KVM fixes from Radim Krčmář: "PPC: - Close a hole which could possibly lead to the host timebase getting out of sync. - Three fixes relating to PTEs and TLB entries for radix guests. - Fix a bug which could lead to an interrupt never getting delivered to the guest, if it is pending for a guest vCPU when the vCPU gets offlined. s390: - Fix false negatives in VSIE validity check (Cc stable) x86: - Fix time drift of VMX preemption timer when a guest uses LAPIC timer in periodic mode (Cc stable) - Unconditionally expose CPUID.IA32_ARCH_CAPABILITIES to allow migration from hosts that don't need retpoline mitigation (Cc stable) - Fix guest crashes on reboot by properly coupling CR4.OSXSAVE and CPUID.OSXSAVE (Cc stable) - Report correct RIP after Hyper-V hypercall #UD (introduced in -rc6)" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: fix #UD address of failed Hyper-V hypercalls kvm: x86: IA32_ARCH_CAPABILITIES is always supported KVM: x86: Update cpuid properly when CR4.OSXAVE or CR4.PKE is changed x86/kvm: fix LAPIC timer drift when guest uses periodic mode KVM: s390: vsie: fix < 8k check for the itdba KVM: PPC: Book 3S HV: Do ptesync in radix guest exit path KVM: PPC: Book3S HV: XIVE: Resend re-routed interrupts on CPU priority change KVM: PPC: Book3S HV: Make radix clear pte when unmapping KVM: PPC: Book3S HV: Make radix use correct tlbie sequence in kvmppc_radix_tlbie_page KVM: PPC: Book3S HV: Snapshot timebase offset on guest entry
2018-05-26arm64: dts: hikey: Fix eMMC corruption regressionJohn Stultz1-1/+0
This patch is a partial revert of commit abd7d0972a19 ("arm64: dts: hikey: Enable HS200 mode on eMMC") which has been causing eMMC corruption on my HiKey board. Symptoms usually looked like: mmc_host mmc0: Bus speed (slot 0) = 24800000Hz (slot req 400000Hz, actual 400000HZ div = 31) ... mmc_host mmc0: Bus speed (slot 0) = 148800000Hz (slot req 150000000Hz, actual 148800000HZ div = 0) mmc0: new HS200 MMC card at address 0001 ... dwmmc_k3 f723d000.dwmmc0: Unexpected command timeout, state 3 mmc_host mmc0: Bus speed (slot 0) = 24800000Hz (slot req 400000Hz, actual 400000HZ div = 31) mmc_host mmc0: Bus speed (slot 0) = 148800000Hz (slot req 150000000Hz, actual 148800000HZ div = 0) mmc_host mmc0: Bus speed (slot 0) = 24800000Hz (slot req 400000Hz, actual 400000HZ div = 31) mmc_host mmc0: Bus speed (slot 0) = 148800000Hz (slot req 150000000Hz, actual 148800000HZ div = 0) mmc_host mmc0: Bus speed (slot 0) = 24800000Hz (slot req 400000Hz, actual 400000HZ div = 31) mmc_host mmc0: Bus speed (slot 0) = 148800000Hz (slot req 150000000Hz, actual 148800000HZ div = 0) print_req_error: I/O error, dev mmcblk0, sector 8810504 Aborting journal on device mmcblk0p10-8. mmc_host mmc0: Bus speed (slot 0) = 24800000Hz (slot req 400000Hz, actual 400000HZ div = 31) mmc_host mmc0: Bus speed (slot 0) = 148800000Hz (slot req 150000000Hz, actual 148800000HZ div = 0) mmc_host mmc0: Bus speed (slot 0) = 24800000Hz (slot req 400000Hz, actual 400000HZ div = 31) mmc_host mmc0: Bus speed (slot 0) = 148800000Hz (slot req 150000000Hz, actual 148800000HZ div = 0) mmc_host mmc0: Bus speed (slot 0) = 24800000Hz (slot req 400000Hz, actual 400000HZ div = 31) mmc_host mmc0: Bus speed (slot 0) = 148800000Hz (slot req 150000000Hz, actual 148800000HZ div = 0) mmc_host mmc0: Bus speed (slot 0) = 24800000Hz (slot req 400000Hz, actual 400000HZ div = 31) mmc_host mmc0: Bus speed (slot 0) = 148800000Hz (slot req 150000000Hz, actual 148800000HZ div = 0) EXT4-fs error (device mmcblk0p10): ext4_journal_check_start:61: Detected aborted journal EXT4-fs (mmcblk0p10): Remounting filesystem read-only And quite often this would result in a disk that wouldn't properly boot even with older kernels. It seems the max-frequency property added by the above patch is causing the problem, so remove it. Cc: Ryan Grachek <ryan@edited.us> Cc: Wei Xu <xuwei5@hisilicon.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Ulf Hansson <ulf.hansson@linaro.org> Cc: YongQin Liu <yongqin.liu@linaro.org> Cc: Leo Yan <leo.yan@linaro.org> Signed-off-by: John Stultz <john.stultz@linaro.org> Tested-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Wei Xu <xuwei04@gmail.com>
2018-05-26Merge branch 'akpm' (patches from Andrew)Linus Torvalds16-43/+125
Merge misc fixes from Andrew Morton: "16 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: kasan: fix memory hotplug during boot kasan: free allocated shadow memory on MEM_CANCEL_ONLINE checkpatch: fix macro argument precedence test init/main.c: include <linux/mem_encrypt.h> kernel/sys.c: fix potential Spectre v1 issue mm/memory_hotplug: fix leftover use of struct page during hotplug proc: fix smaps and meminfo alignment mm: do not warn on offline nodes unless the specific node is explicitly requested mm, memory_hotplug: make has_unmovable_pages more robust mm/kasan: don't vfree() nonexistent vm_area MAINTAINERS: change hugetlbfs maintainer and update files ipc/shm: fix shmat() nil address after round-down when remapping Revert "ipc/shm: Fix shmat mmap nil-page protection" idr: fix invalid ptr dereference on item delete ocfs2: revert "ocfs2/o2hb: check len for bio_add_page() to avoid getting incorrect bio" mm: fix nr_rotate_swap leak in swapon() error case
2018-05-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds49-193/+372
Pull networking fixes from David Miller: "Let's begin the holiday weekend with some networking fixes: 1) Whoops need to restrict cfg80211 wiphy names even more to 64 bytes. From Eric Biggers. 2) Fix flags being ignored when using kernel_connect() with SCTP, from Xin Long. 3) Use after free in DCCP, from Alexey Kodanev. 4) Need to check rhltable_init() return value in ipmr code, from Eric Dumazet. 5) XDP handling fixes in virtio_net from Jason Wang. 6) Missing RTA_TABLE in rtm_ipv4_policy[], from Roopa Prabhu. 7) Need to use IRQ disabling spinlocks in mlx4_qp_lookup(), from Jack Morgenstein. 8) Prevent out-of-bounds speculation using indexes in BPF, from Daniel Borkmann. 9) Fix regression added by AF_PACKET link layer cure, from Willem de Bruijn. 10) Correct ENIC dma mask, from Govindarajulu Varadarajan. 11) Missing config options for PMTU tests, from Stefano Brivio" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (48 commits) ibmvnic: Fix partial success login retries selftests/net: Add missing config options for PMTU tests mlx4_core: allocate ICM memory in page size chunks enic: set DMA mask to 47 bit ppp: remove the PPPIOCDETACH ioctl ipv4: remove warning in ip_recv_error net : sched: cls_api: deal with egdev path only if needed vhost: synchronize IOTLB message with dev cleanup packet: fix reserve calculation net/mlx5: IPSec, Fix a race between concurrent sandbox QP commands net/mlx5e: When RXFCS is set, add FCS data into checksum calculation bpf: properly enforce index mask to prevent out-of-bounds speculation net/mlx4: Fix irq-unsafe spinlock usage net: phy: broadcom: Fix bcm_write_exp() net: phy: broadcom: Fix auxiliary control register reads net: ipv4: add missing RTA_TABLE to rtm_ipv4_policy net/mlx4: fix spelling mistake: "Inrerface" -> "Interface" and rephrase message ibmvnic: Only do H_EOI for mobility events tuntap: correctly set SOCKWQ_ASYNC_NOSPACE virtio-net: fix leaking page for gso packet during mergeable XDP ...
2018-05-26kasan: fix memory hotplug during bootDavid Hildenbrand1-1/+1
Using module_init() is wrong. E.g. ACPI adds and onlines memory before our memory notifier gets registered. This makes sure that ACPI memory detected during boot up will not result in a kernel crash. Easily reproducible with QEMU, just specify a DIMM when starting up. Link: http://lkml.kernel.org/r/20180522100756.18478-3-david@redhat.com Fixes: 786a8959912e ("kasan: disable memory hotplug") Signed-off-by: David Hildenbrand <david@redhat.com> Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-26kasan: free allocated shadow memory on MEM_CANCEL_ONLINEDavid Hildenbrand1-0/+1
We have to free memory again when we cancel onlining, otherwise a later onlining attempt will fail. Link: http://lkml.kernel.org/r/20180522100756.18478-2-david@redhat.com Fixes: fa69b5989bb0 ("mm/kasan: add support for memory hotplug") Signed-off-by: David Hildenbrand <david@redhat.com> Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-26checkpatch: fix macro argument precedence testJoe Perches1-1/+1
checkpatch's macro argument precedence test is broken so fix it. Link: http://lkml.kernel.org/r/5dd900e9197febc1995604bb33c23c136d8b33ce.camel@perches.com Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-26init/main.c: include <linux/mem_encrypt.h>Mathieu Malaterre1-0/+1
In commit c7753208a94c ("x86, swiotlb: Add memory encryption support") a call to function `mem_encrypt_init' was added. Include prototype defined in header <linux/mem_encrypt.h> to prevent a warning reported during compilation with W=1: init/main.c:494:20: warning: no previous prototype for `mem_encrypt_init' [-Wmissing-prototypes] Link: http://lkml.kernel.org/r/20180522195533.31415-1-malat@debian.org Signed-off-by: Mathieu Malaterre <malat@debian.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Kees Cook <keescook@chromium.org> Cc: Laura Abbott <lauraa@codeaurora.org> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Cc: Gargi Sharma <gs051095@gmail.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-26kernel/sys.c: fix potential Spectre v1 issueGustavo A. R. Silva1-0/+5
`resource' can be controlled by user-space, hence leading to a potential exploitation of the Spectre variant 1 vulnerability. This issue was detected with the help of Smatch: kernel/sys.c:1474 __do_compat_sys_old_getrlimit() warn: potential spectre issue 'get_current()->signal->rlim' (local cap) kernel/sys.c:1455 __do_sys_old_getrlimit() warn: potential spectre issue 'get_current()->signal->rlim' (local cap) Fix this by sanitizing *resource* before using it to index current->signal->rlim Notice that given that speculation windows are large, the policy is to kill the speculation on the first load and not worry if it can be completed with a dependent load/store [1]. [1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2 Link: http://lkml.kernel.org/r/20180515030038.GA11822@embeddedor.com Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-26mm/memory_hotplug: fix leftover use of struct page during hotplugJonathan Cameron3-6/+9
The case of a new numa node got missed in avoiding using the node info from page_struct during hotplug. In this path we have a call to register_mem_sect_under_node (which allows us to specify it is hotplug so don't change the node), via link_mem_sections which unfortunately does not. Fix is to pass check_nid through link_mem_sections as well and disable it in the new numa node path. Note the bug only 'sometimes' manifests depending on what happens to be in the struct page structures - there are lots of them and it only needs to match one of them. The result of the bug is that (with a new memory only node) we never successfully call register_mem_sect_under_node so don't get the memory associated with the node in sysfs and meminfo for the node doesn't report it. It came up whilst testing some arm64 hotplug patches, but appears to be universal. Whilst I'm triggering it by removing then reinserting memory to a node with no other elements (thus making the node disappear then appear again), it appears it would happen on hotplugging memory where there was none before and it doesn't seem to be related the arm64 patches. These patches call __add_pages (where most of the issue was fixed by Pavel's patch). If there is a node at the time of the __add_pages call then all is well as it calls register_mem_sect_under_node from there with check_nid set to false. Without a node that function returns having not done the sysfs related stuff as there is no node to use. This is expected but it is the resulting path that fails... Exact path to the problem is as follows: mm/memory_hotplug.c: add_memory_resource() The node is not online so we enter the 'if (new_node)' twice, on the second such block there is a call to link_mem_sections which calls into drivers/node.c: link_mem_sections() which calls drivers/node.c: register_mem_sect_under_node() which calls get_nid_for_pfn and keeps trying until the output of that matches the expected node (passed all the way down from add_memory_resource) It is effectively the same fix as the one referred to in the fixes tag just in the code path for a new node where the comments point out we have to rerun the link creation because it will have failed in register_new_memory (as there was no node at the time). (actually that comment is wrong now as we don't have register_new_memory any more it got renamed to hotplug_memory_register in Pavel's patch). Link: http://lkml.kernel.org/r/20180504085311.1240-1-Jonathan.Cameron@huawei.com Fixes: fc44f7f9231a ("mm/memory_hotplug: don't read nid from struct page during hotplug") Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-26proc: fix smaps and meminfo alignmentHugh Dickins1-5/+0
The 4.17-rc /proc/meminfo and /proc/<pid>/smaps look ugly: single-digit numbers (commonly 0) are misaligned. Remove seq_put_decimal_ull_width()'s leftover optimization for single digits: it's wrong now that num_to_str() takes care of the width. Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1805241554210.1326@eggly.anvils Fixes: d1be35cb6f96 ("proc: add seq_put_decimal_ull_width to speed up /proc/pid/smaps") Signed-off-by: Hugh Dickins <hughd@google.com> Cc: Andrei Vagin <avagin@openvz.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-26mm: do not warn on offline nodes unless the specific node is explicitly ↵Michal Hocko1-1/+1
requested Oscar has noticed that we splat WARNING: CPU: 0 PID: 64 at ./include/linux/gfp.h:467 vmemmap_alloc_block+0x4e/0xc9 [...] CPU: 0 PID: 64 Comm: kworker/u4:1 Tainted: G W E 4.17.0-rc5-next-20180517-1-default+ #66 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014 Workqueue: kacpi_hotplug acpi_hotplug_work_fn Call Trace: vmemmap_populate+0xf2/0x2ae sparse_mem_map_populate+0x28/0x35 sparse_add_one_section+0x4c/0x187 __add_pages+0xe7/0x1a0 add_pages+0x16/0x70 add_memory_resource+0xa3/0x1d0 add_memory+0xe4/0x110 acpi_memory_device_add+0x134/0x2e0 acpi_bus_attach+0xd9/0x190 acpi_bus_scan+0x37/0x70 acpi_device_hotplug+0x389/0x4e0 acpi_hotplug_work_fn+0x1a/0x30 process_one_work+0x146/0x340 worker_thread+0x47/0x3e0 kthread+0xf5/0x130 ret_from_fork+0x35/0x40 when adding memory to a node that is currently offline. The VM_WARN_ON is just too loud without a good reason. In this particular case we are doing alloc_pages_node(node, GFP_KERNEL|__GFP_RETRY_MAYFAIL|__GFP_NOWARN, order) so we do not insist on allocating from the given node (it is more a hint) so we can fall back to any other populated node and moreover we explicitly ask to not warn for the allocation failure. Soften the warning only to cases when somebody asks for the given node explicitly by __GFP_THISNODE. Link: http://lkml.kernel.org/r/20180523125555.30039-3-mhocko@kernel.org Signed-off-by: Michal Hocko <mhocko@suse.com> Reported-by: Oscar Salvador <osalvador@techadventures.net> Tested-by: Oscar Salvador <osalvador@techadventures.net> Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Reza Arbab <arbab@linux.vnet.ibm.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>