summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2019-01-30fs: Don't need to put list_lru into its own cachelineWaiman Long1-4/+5
The list_lru structure is essentially just a pointer to a table of per-node LRU lists. Even if CONFIG_MEMCG_KMEM is defined, the list field is just used for LRU list registration and shrinker_id is set at initialization. Those fields won't need to be touched that often. So there is no point to make the list_lru structures to sit in their own cachelines. Signed-off-by: Waiman Long <longman@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-30fs/dcache: Fix incorrect nr_dentry_unused accounting in shrink_dcache_sb()Waiman Long1-5/+1
The nr_dentry_unused per-cpu counter tracks dentries in both the LRU lists and the shrink lists where the DCACHE_LRU_LIST bit is set. The shrink_dcache_sb() function moves dentries from the LRU list to a shrink list and subtracts the dentry count from nr_dentry_unused. This is incorrect as the nr_dentry_unused count will also be decremented in shrink_dentry_list() via d_shrink_del(). To fix this double decrement, the decrement in the shrink_dcache_sb() function is taken out. Fixes: 4e717f5c1083 ("list_lru: remove special case function list_lru_dispose_all." Cc: stable@kernel.org Signed-off-by: Waiman Long <longman@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-30ucc_geth: Reset BQL queue when stopping deviceMathias Thore1-0/+2
After a timeout event caused by for example a broadcast storm, when the MAC and PHY are reset, the BQL TX queue needs to be reset as well. Otherwise, the device will exhibit severe performance issues even after the storm has ended. Co-authored-by: David Gounaris <david.gounaris@infinera.com> Signed-off-by: Mathias Thore <mathias.thore@infinera.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-30cpu/hotplug: Fix "SMT disabled by BIOS" detection for KVMJosh Poimboeuf6-35/+8
With the following commit: 73d5e2b47264 ("cpu/hotplug: detect SMT disabled by BIOS") ... the hotplug code attempted to detect when SMT was disabled by BIOS, in which case it reported SMT as permanently disabled. However, that code broke a virt hotplug scenario, where the guest is booted with only primary CPU threads, and a sibling is brought online later. The problem is that there doesn't seem to be a way to reliably distinguish between the HW "SMT disabled by BIOS" case and the virt "sibling not yet brought online" case. So the above-mentioned commit was a bit misguided, as it permanently disabled SMT for both cases, preventing future virt sibling hotplugs. Going back and reviewing the original problems which were attempted to be solved by that commit, when SMT was disabled in BIOS: 1) /sys/devices/system/cpu/smt/control showed "on" instead of "notsupported"; and 2) vmx_vm_init() was incorrectly showing the L1TF_MSG_SMT warning. I'd propose that we instead consider #1 above to not actually be a problem. Because, at least in the virt case, it's possible that SMT wasn't disabled by BIOS and a sibling thread could be brought online later. So it makes sense to just always default the smt control to "on" to allow for that possibility (assuming cpuid indicates that the CPU supports SMT). The real problem is #2, which has a simple fix: change vmx_vm_init() to query the actual current SMT state -- i.e., whether any siblings are currently online -- instead of looking at the SMT "control" sysfs value. So fix it by: a) reverting the original "fix" and its followup fix: 73d5e2b47264 ("cpu/hotplug: detect SMT disabled by BIOS") bc2d8d262cba ("cpu/hotplug: Fix SMT supported evaluation") and b) changing vmx_vm_init() to query the actual current SMT state -- instead of the sysfs control value -- to determine whether the L1TF warning is needed. This also requires the 'sched_smt_present' variable to exported, instead of 'cpu_smt_control'. Fixes: 73d5e2b47264 ("cpu/hotplug: detect SMT disabled by BIOS") Reported-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Joe Mario <jmario@redhat.com> Cc: Jiri Kosina <jikos@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: kvm@vger.kernel.org Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/e3a85d585da28cc333ecbc1e78ee9216e6da9396.1548794349.git.jpoimboe@redhat.com
2019-01-30Merge branch 'net-various-compat-ioctl-fixes'David S. Miller1-19/+63
Johannes Berg says: ==================== various compat ioctl fixes Back a long time ago, I already fixed a few of these by passing the size of the struct ifreq to do_sock_ioctl(). However, Robert found more cases, and now it won't be as simple because we'd have to pass that down all the way to e.g. bond_do_ioctl() which isn't really feasible. Therefore, restore the old code. While looking at why SIOCGIFNAME was broken, I realized that Al had removed that case - which had been handled in an explicit separate function - as well, and looking through his work at the time I saw that bond ioctls were also affected by the erroneous removal. I've restored SIOCGIFNAME and bond ioctls by going through the (now renamed) dev_ifsioc() instead of reintroducing their own helper functions, which I hope is correct but have only tested with SIOCGIFNAME. ==================== Acked-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-30net: socket: make bond ioctls go through compat_ifreq_ioctl()Johannes Berg1-4/+4
Same story as before, these use struct ifreq and thus need to be read with the shorter version to not cause faults. Cc: stable@vger.kernel.org Fixes: f92d4fc95341 ("kill bond_ioctl()") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-30net: socket: fix SIOCGIFNAME in compatJohannes Berg1-1/+2
As reported by Robert O'Callahan in https://bugzilla.kernel.org/show_bug.cgi?id=202273 reverting the previous changes in this area broke the SIOCGIFNAME ioctl in compat again (I'd previously fixed it after his previous report of breakage in https://bugzilla.kernel.org/show_bug.cgi?id=199469). This is obviously because I fixed SIOCGIFNAME more or less by accident. Fix it explicitly now by making it pass through the restored compat translation code. Cc: stable@vger.kernel.org Fixes: 4cf808e7ac32 ("kill dev_ifname32()") Reported-by: Robert O'Callahan <robert@ocallahan.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-30Revert "kill dev_ifsioc()"Johannes Berg1-0/+49
This reverts commit bf4405737f9f ("kill dev_ifsioc()"). This wasn't really unused as implied by the original commit, it still handles the copy to/from user differently, and the commit thus caused issues such as https://bugzilla.kernel.org/show_bug.cgi?id=199469 and https://bugzilla.kernel.org/show_bug.cgi?id=202273 However, deviating from a strict revert, rename dev_ifsioc() to compat_ifreq_ioctl() to be clearer as to its purpose and add a comment. Cc: stable@vger.kernel.org Fixes: bf4405737f9f ("kill dev_ifsioc()") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-30Revert "socket: fix struct ifreq size in compat ioctl"Johannes Berg1-14/+8
This reverts commit 1cebf8f143c2 ("socket: fix struct ifreq size in compat ioctl"), it's a bugfix for another commit that I'll revert next. This is not a 'perfect' revert, I'm keeping some coding style intact rather than revert to the state with indentation errors. Cc: stable@vger.kernel.org Fixes: 1cebf8f143c2 ("socket: fix struct ifreq size in compat ioctl") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-30Revert "staging: erofs: keep corrupted fs from crashing kernel in erofs_namei()"Greg Kroah-Hartman1-89/+78
This reverts commit d4104c5e783f5d053b97268fb92001d785de7dd5. Turns out it still needs some more work, I merged it to soon :( Reported-by: Gao Xiang <gaoxiang25@huawei.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-30drm/amdgpu: Transfer fences to dmabuf importerChris Wilson1-8/+51
amdgpu only uses shared-fences internally, but dmabuf importers rely on implicit write hazard tracking via the reservation_object.fence_excl. For example, the importer use the write hazard for timing a page flip to only occur after the exporter has finished flushing its write into the surface. As such, on exporting a dmabuf, we must either flush all outstanding fences (for we do not know which are writes and should have been exclusive) or alternatively create a new exclusive fence that is the composite of all the existing shared fences, and so will only be signaled when all earlier fences are signaled (ensuring that we can not be signaled before the completion of any earlier write). v2: reservation_object is already locked by amdgpu_bo_reserve() v3: Replace looping with get_fences_rcu and special case the promotion of a single shared fence directly to an exclusive fence, bypassing the fence array. v4: Drop the fence array ref after assigning to reservation_object Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107341 Testcase: igt/amd_prime/amd-to-i915 References: 8e94a46c1770 ("drm/amdgpu: Attach exclusive fence to prime exported bo's. (v5)") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Reviewed-by: "Christian König" <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-01-30Merge tag 'iommu-fixes-v5.0-rc4' of ↵Linus Torvalds3-7/+18
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU fixes from Joerg Roedel: "A few more fixes this time: - Two patches to fix the error path of the map_sg implementation of the AMD IOMMU driver. - Also a missing IOTLB flush is fixed in the AMD IOMMU driver. - Memory leak fix for the Intel IOMMU driver. - Fix a regression in the Mediatek IOMMU driver which caused device initialization to fail (seen as broken HDMI output)" * tag 'iommu-fixes-v5.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/amd: Fix IOMMU page flush when detach device from a domain iommu/mediatek: Use correct fwspec in mtk_iommu_add_device() iommu/vt-d: Fix memory leak in intel_iommu_put_resv_regions() iommu/amd: Unmap all mapped pages in error path of map_sg iommu/amd: Call free_iova_fast with pfn in map_sg
2019-01-30Merge tag 'gpio-v5.0-3' of ↵Linus Torvalds5-17/+41
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio Pull GPIO fixes from Linus Walleij: "Here is a bunch of GPIO fixes for the v5.0 series. I was helped out by Bartosz in collecting these fixes, for which I am very grateful, the biggest achievement in GPIO right now is work distribution. There is one serious core fix (timestamping) and a bunch of driver fixes: - Fix timestamps on nested IRQs - Handle IRQs properly in multiple instances of PCF857x - Use the right data register and IRQ type setting in the Spreadtrum GPIO driver - Let the value argument work properly when setting direction in the Altera GPIO driver - Mask interrupts properly in the vf610 driver" * tag 'gpio-v5.0-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: gpio: vf610: Mask all GPIO interrupts gpio: altera-a10sr: Set proper output level for direction_output gpio: sprd: Fix incorrect irq type setting for the async EIC gpio: sprd: Fix the incorrect data register gpiolib: fix line event timestamps for nested irqs gpio: pcf857x: Fix interrupts on multiple instances
2019-01-30btrfs: On error always free subvol_name in btrfs_mountEric W. Biederman1-0/+3
The subvol_name is allocated in btrfs_parse_subvol_options and is consumed and freed in mount_subvol. Add a free to the error paths that don't call mount_subvol so that it is guaranteed that subvol_name is freed when an error happens. Fixes: 312c89fbca06 ("btrfs: cleanup btrfs_mount() using btrfs_mount_root()") Cc: stable@vger.kernel.org # v4.19+ Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-01-30btrfs: clean up pending block groups when transaction commit abortsDavid Sterba1-0/+16
The fstests generic/475 stresses transaction aborts and can reveal space accounting or use-after-free bugs regarding block goups. In this case the pending block groups that remain linked to the structures after transaction commit aborts in the middle. The corrupted slabs lead to failures in following tests, eg. generic/476 [ 8172.752887] BUG: unable to handle kernel NULL pointer dereference at 0000000000000058 [ 8172.755799] #PF error: [normal kernel read fault] [ 8172.757571] PGD 661ae067 P4D 661ae067 PUD 3db8e067 PMD 0 [ 8172.759000] Oops: 0000 [#1] PREEMPT SMP [ 8172.760209] CPU: 0 PID: 39 Comm: kswapd0 Tainted: G W 5.0.0-rc2-default #408 [ 8172.762495] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.2-0-gf9626cc-prebuilt.qemu-project.org 04/01/2014 [ 8172.765772] RIP: 0010:shrink_page_list+0x2f9/0xe90 [ 8172.770453] RSP: 0018:ffff967f00663b18 EFLAGS: 00010287 [ 8172.771184] RAX: 0000000000000000 RBX: ffff967f00663c20 RCX: 0000000000000000 [ 8172.772850] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8c0620ab20e0 [ 8172.774629] RBP: ffff967f00663dd8 R08: 0000000000000000 R09: 0000000000000000 [ 8172.776094] R10: ffff8c0620ab22f8 R11: ffff8c063f772688 R12: ffff967f00663b78 [ 8172.777533] R13: ffff8c063f625600 R14: ffff8c063f625608 R15: dead000000000200 [ 8172.778886] FS: 0000000000000000(0000) GS:ffff8c063d400000(0000) knlGS:0000000000000000 [ 8172.780545] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 8172.781787] CR2: 0000000000000058 CR3: 000000004e962000 CR4: 00000000000006f0 [ 8172.783547] Call Trace: [ 8172.784112] shrink_inactive_list+0x194/0x410 [ 8172.784747] shrink_node_memcg.constprop.85+0x3a5/0x6a0 [ 8172.785472] shrink_node+0x62/0x1e0 [ 8172.786011] balance_pgdat+0x216/0x460 [ 8172.786577] kswapd+0xe3/0x4a0 [ 8172.787085] ? finish_wait+0x80/0x80 [ 8172.787795] ? balance_pgdat+0x460/0x460 [ 8172.788799] kthread+0x116/0x130 [ 8172.789640] ? kthread_create_on_node+0x60/0x60 [ 8172.790323] ret_from_fork+0x24/0x30 [ 8172.794253] CR2: 0000000000000058 or accounting errors at umount time: [ 8159.537251] WARNING: CPU: 2 PID: 19031 at fs/btrfs/extent-tree.c:5987 btrfs_free_block_groups+0x3d5/0x410 [btrfs] [ 8159.543325] CPU: 2 PID: 19031 Comm: umount Tainted: G W 5.0.0-rc2-default #408 [ 8159.545472] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.2-0-gf9626cc-prebuilt.qemu-project.org 04/01/2014 [ 8159.548155] RIP: 0010:btrfs_free_block_groups+0x3d5/0x410 [btrfs] [ 8159.554030] RSP: 0018:ffff967f079cbde8 EFLAGS: 00010206 [ 8159.555144] RAX: 0000000001000000 RBX: ffff8c06366cf800 RCX: 0000000000000000 [ 8159.556730] RDX: 0000000000000002 RSI: 0000000000000001 RDI: ffff8c06255ad800 [ 8159.558279] RBP: ffff8c0637ac0000 R08: 0000000000000001 R09: 0000000000000000 [ 8159.559797] R10: 0000000000000000 R11: 0000000000000001 R12: ffff8c0637ac0108 [ 8159.561296] R13: ffff8c0637ac0158 R14: 0000000000000000 R15: dead000000000100 [ 8159.562852] FS: 00007f7f693b9fc0(0000) GS:ffff8c063d800000(0000) knlGS:0000000000000000 [ 8159.564839] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 8159.566160] CR2: 00007f7f68fab7b0 CR3: 000000000aec7000 CR4: 00000000000006e0 [ 8159.567898] Call Trace: [ 8159.568597] close_ctree+0x17f/0x350 [btrfs] [ 8159.569628] generic_shutdown_super+0x64/0x100 [ 8159.570808] kill_anon_super+0x14/0x30 [ 8159.571857] btrfs_kill_super+0x12/0xa0 [btrfs] [ 8159.573063] deactivate_locked_super+0x29/0x60 [ 8159.574234] cleanup_mnt+0x3b/0x70 [ 8159.575176] task_work_run+0x98/0xc0 [ 8159.576177] exit_to_usermode_loop+0x83/0x90 [ 8159.577315] do_syscall_64+0x15b/0x180 [ 8159.578339] entry_SYSCALL_64_after_hwframe+0x49/0xbe This fix is based on 2 Josef's patches that used sideefects of btrfs_create_pending_block_groups, this fix introduces the helper that does what we need. CC: stable@vger.kernel.org # 4.4+ CC: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-01-30btrfs: fix potential oops in device_list_addAl Viro1-2/+2
alloc_fs_devices() can return ERR_PTR(-ENOMEM), so dereferencing its result before the check for IS_ERR() is a bad idea. Fixes: d1a63002829a4 ("btrfs: add members to fs_devices to track fsid changes") Reviewed-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-01-30blk-mq: fix a hung issue when fsyncJianchao Wang1-1/+1
Florian reported a io hung issue when fsync(). It should be triggered by following race condition. data + post flush a flush blk_flush_complete_seq case REQ_FSEQ_DATA blk_flush_queue_rq issued to driver blk_mq_dispatch_rq_list try to issue a flush req failed due to NON-NCQ command .queue_rq return BLK_STS_DEV_RESOURCE request completion req->end_io // doesn't check RESTART mq_flush_data_end_io case REQ_FSEQ_POSTFLUSH blk_kick_flush do nothing because previous flush has not been completed blk_mq_run_hw_queue insert rq to hctx->dispatch due to RESTART is still set, do nothing To fix this, replace the blk_mq_run_hw_queue in mq_flush_data_end_io with blk_mq_sched_restart to check and clear the RESTART flag. Fixes: bd166ef1 (blk-mq-sched: add framework for MQ capable IO schedulers) Reported-by: Florian Stecker <m19@florianstecker.de> Tested-by: Florian Stecker <m19@florianstecker.de> Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-01-30block: pass no-op callback to INIT_WORK().Tetsuo Handa1-1/+5
syzbot is hitting flush_work() warning caused by commit 4d43d395fed12463 ("workqueue: Try to catch flush_work() without INIT_WORK().") [1]. Although that commit did not expect INIT_WORK(NULL) case, calling flush_work() without setting a valid callback should be avoided anyway. Fix this problem by setting a no-op callback instead of NULL. [1] https://syzkaller.appspot.com/bug?id=e390366bc48bc82a7c668326e0663be3b91cbd29 Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reported-and-tested-by: syzbot <syzbot+ba2a929dcf8e704c180e@syzkaller.appspotmail.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-01-30usb: gadget: musb: fix short isoc packets with inventra dmaPaul Elder2-22/+12
Handling short packets (length < max packet size) in the Inventra DMA engine in the MUSB driver causes the MUSB DMA controller to hang. An example of a problem that is caused by this problem is when streaming video out of a UVC gadget, only the first video frame is transferred. For short packets (mode-0 or mode-1 DMA), MUSB_TXCSR_TXPKTRDY must be set manually by the driver. This was previously done in musb_g_tx (musb_gadget.c), but incorrectly (all csr flags were cleared, and only MUSB_TXCSR_MODE and MUSB_TXCSR_TXPKTRDY were set). Fixing that problem allows some requests to be transferred correctly, but multiple requests were often put together in one USB packet, and caused problems if the packet size was not a multiple of 4. Instead, set MUSB_TXCSR_TXPKTRDY in dma_controller_irq (musbhsdma.c), just like host mode transfers. This topic was originally tackled by Nicolas Boichat [0] [1] and is discussed further at [2] as part of his GSoC project [3]. [0] https://groups.google.com/forum/?hl=en#!topic/beagleboard-gsoc/k8Azwfp75CU [1] https://gitorious.org/beagleboard-usbsniffer/beagleboard-usbsniffer-kernel/commit/b0be3b6cc195ba732189b04f1d43ec843c3e54c9?p=beagleboard-usbsniffer:beagleboard-usbsniffer-kernel.git;a=patch;h=b0be3b6cc195ba732189b04f1d43ec843c3e54c9 [2] http://beagleboard-usbsniffer.blogspot.com/2010/07/musb-isochronous-transfers-fixed.html [3] http://elinux.org/BeagleBoard/GSoC/USBSniffer Fixes: 550a7375fe72 ("USB: Add MUSB and TUSB support") Signed-off-by: Paul Elder <paul.elder@ideasonboard.com> Signed-off-by: Bin Liu <b-liu@ti.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-30mic: vop: Fix broken virtqueuesVincent Whitchurch1-26/+34
VOP is broken in mainline since commit 1ce9e6055fa0a9043 ("virtio_ring: introduce packed ring support"); attempting to use the virtqueues leads to various kernel crashes. I'm testing it with my not-yet-merged loopback patches, but even the in-tree MIC hardware cannot work. The problem is not in the referenced commit per se, but is due to the following hack in vop_find_vq() which depends on the layout of private structures in other source files, which that commit happened to change: /* * To reassign the used ring here we are directly accessing * struct vring_virtqueue which is a private data structure * in virtio_ring.c. At the minimum, a BUILD_BUG_ON() in * vring_new_virtqueue() would ensure that * (&vq->vring == (struct vring *) (&vq->vq + 1)); */ vr = (struct vring *)(vq + 1); vr->used = used; Fix vop by using __vring_new_virtqueue() to create the needed vring layout from the start, instead of attempting to patch in the used ring later. __vring_new_virtqueue() was added way back in commit 2a2d1382fe9dcc ("virtio: Add improved queue allocation API") in order to address mic's usecase, according to the commit message. Fixes: 1ce9e6055fa0 ("virtio_ring: introduce packed ring support") Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-30staging: erofs: keep corrupted fs from crashing kernel in erofs_namei()Gao Xiang1-78/+89
As Al pointed out, " ... and while we are at it, what happens to unsigned int nameoff = le16_to_cpu(de[mid].nameoff); unsigned int matched = min(startprfx, endprfx); struct qstr dname = QSTR_INIT(data + nameoff, unlikely(mid >= ndirents - 1) ? maxsize - nameoff : le16_to_cpu(de[mid + 1].nameoff) - nameoff); /* string comparison without already matched prefix */ int ret = dirnamecmp(name, &dname, &matched); if le16_to_cpu(de[...].nameoff) is not monotonically increasing? I.e. what's to prevent e.g. (unsigned)-1 ending up in dname.len? Corrupted fs image shouldn't oops the kernel.. " Revisit the related lookup flow to address the issue. Fixes: d72d1ce60174 ("staging: erofs: add namei functions") Cc: <stable@vger.kernel.org> # 4.19+ Suggested-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-30staging: octeon: fix broken phylib usageAaro Koskinen1-1/+1
Commit 2b3e88ea6528 ("net: phy: improve phy state checking") added checks for phylib usage, and this triggers with OCTEON ethernet and results in broken networking. Fix by replacing phy_start_aneg() with phy_start(). Fixes: 2b3e88ea6528 ("net: phy: improve phy state checking") Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-30mei: free read cb on ctrl_wr list flushAlexander Usyskin1-1/+4
There is a little window during disconnection flow when read cb is moved between lists and may be not freed. Remove moving read cbs explicitly during flash fixes this memory leak. Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-30samples: mei: use /dev/mei0 instead of /dev/meiTomas Winkler1-1/+1
The device was moved from misc device to character devices to support multiple mei devices. Cc: <stable@vger.kernel.org> #v4.9+ Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-30mei: me: add ice lake point device id.Tomas Winkler2-0/+4
Add icelake mei device id. Cc: <stable@vger.kernel.org> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-30binderfs: respect limit on binder control creationChristian Brauner1-1/+9
We currently adhere to the reserved devices limit when creating new binderfs devices in binderfs instances not located in the inital ipc namespace. But it is still possible to rob the host instances of their 4 reserved devices by creating the maximum allowed number of devices in a single binderfs instance located in a non-initial ipc namespace and then mounting 4 separate binderfs instances in non-initial ipc namespaces. That happens because the limit is currently not respected for the creation of the initial binder-control device node. Block this nonsense by performing the same check in binderfs_binder_ctl_create() that we perform in binderfs_binder_device_create(). Fixes: 36bdf3cae09d ("binderfs: reserve devices for initial mount") Signed-off-by: Christian Brauner <christian@brauner.io> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-30binder: fix CONFIG_ANDROID_BINDER_DEVICESChristian Brauner1-14/+16
Several users have tried to only rely on binderfs to provide binder devices and set CONFIG_ANDROID_BINDER_DEVICES="" empty. This is a great use-case of binderfs and one that was always intended to work. However, this is currently not possible since setting CONFIG_ANDROID_BINDER_DEVICES="" emtpy will simply panic the kernel: kobject: (00000000028c2f79): attempted to be registered with empty name! WARNING: CPU: 7 PID: 1703 at lib/kobject.c:228 kobject_add_internal+0x288/0x2b0 Modules linked in: binder_linux(+) bridge stp llc ipmi_ssif gpio_ich dcdbas coretemp kvm_intel kvm irqbypass serio_raw input_leds lpc_ich i5100_edac mac_hid ipmi_si ipmi_devintf ipmi_msghandler sch_fq_codel ib_i CPU: 7 PID: 1703 Comm: modprobe Not tainted 5.0.0-rc2-brauner-binderfs #263 Hardware name: Dell DCS XS24-SC2 /XS24-SC2 , BIOS S59_3C20 04/07/2011 RIP: 0010:kobject_add_internal+0x288/0x2b0 Code: 12 95 48 c7 c7 78 63 3b 95 e8 77 35 71 ff e9 91 fe ff ff 0f 0b eb a7 0f 0b eb 9a 48 89 de 48 c7 c7 00 63 3b 95 e8 f8 95 6a ff <0f> 0b 41 bc ea ff ff ff e9 6d fe ff ff 41 bc fe ff ff ff e9 62 fe RSP: 0018:ffff973f84237a30 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff8b53e2472010 RCX: 0000000000000006 RDX: 0000000000000007 RSI: 0000000000000086 RDI: ffff8b53edbd63a0 RBP: ffff973f84237a60 R08: 0000000000000342 R09: 0000000000000004 R10: ffff973f84237af0 R11: 0000000000000001 R12: 0000000000000000 R13: ffff8b53e9f1a1e0 R14: 00000000e9f1a1e0 R15: 0000000000a00037 FS: 00007fbac36f7540(0000) GS:ffff8b53edbc0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fbac364cfa7 CR3: 00000004a6d48000 CR4: 00000000000406e0 Call Trace: kobject_add+0x71/0xd0 ? _cond_resched+0x19/0x40 ? mutex_lock+0x12/0x40 device_add+0x12e/0x6b0 device_create_groups_vargs+0xe4/0xf0 device_create_with_groups+0x3f/0x60 ? _cond_resched+0x19/0x40 misc_register+0x140/0x180 binder_init+0x1ed/0x2d4 [binder_linux] ? trace_event_define_fields_binder_transaction_fd_send+0x8e/0x8e [binder_linux] do_one_initcall+0x4a/0x1c9 ? _cond_resched+0x19/0x40 ? kmem_cache_alloc_trace+0x151/0x1c0 do_init_module+0x5f/0x216 load_module+0x223d/0x2b20 __do_sys_finit_module+0xfc/0x120 ? __do_sys_finit_module+0xfc/0x120 __x64_sys_finit_module+0x1a/0x20 do_syscall_64+0x5a/0x120 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7fbac3202839 Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 1f f6 2c 00 f7 d8 64 89 01 48 RSP: 002b:00007ffd1494a908 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 RAX: ffffffffffffffda RBX: 000055b629ebec60 RCX: 00007fbac3202839 RDX: 0000000000000000 RSI: 000055b629c20d2e RDI: 0000000000000003 RBP: 000055b629c20d2e R08: 0000000000000000 R09: 000055b629ec2310 R10: 0000000000000003 R11: 0000000000000246 R12: 0000000000000000 R13: 000055b629ebed70 R14: 0000000000040000 R15: 000055b629ebec60 So check for the empty string since strsep() will otherwise return the emtpy string which will cause kobject_add_internal() to panic when trying to add a kobject with an emtpy name. Fixes: ac4812c5ffbb ("binder: Support multiple /dev instances") Cc: Martijn Coenen <maco@google.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Acked-by: Todd Kjos <tkjos@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-30selftests: add binderfs selftestsChristian Brauner5-0/+286
This adds the promised selftest for binderfs. It will verify the following things: - binderfs mounting works - binder device allocation works - performing a binder ioctl() request through a binderfs device works - binder device removal works - binder-control removal fails - binderfs unmounting works The tests are performed both privileged and unprivileged. The latter verifies that binderfs behaves correctly in user namespaces. Cc: Todd Kjos <tkjos@google.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Acked-by: Shuah Khan <shuah@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-30debugfs: debugfs_lookup() should return NULL if not foundGreg Kroah-Hartman1-5/+5
Lots of callers of debugfs_lookup() were just checking NULL to see if the file/directory was found or not. By changing this in ff9fb72bc077 ("debugfs: return error values, not NULL") we caused some subsystems to easily crash. Fixes: ff9fb72bc077 ("debugfs: return error values, not NULL") Reported-by: syzbot+b382ba6a802a3d242790@syzkaller.appspotmail.com Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Omar Sandoval <osandov@fb.com> Cc: Jens Axboe <axboe@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-30staging: speakup: fix tty-operation NULL derefsJohan Hovold1-2/+4
The send_xchar() and tiocmset() tty operations are optional. Add the missing sanity checks to prevent user-space triggerable NULL-pointer dereferences. Fixes: 6b9ad1c742bf ("staging: speakup: add send_xchar, tiocmset and input functionality for tty") Cc: stable <stable@vger.kernel.org> # 4.13 Cc: Okash Khawaja <okash.khawaja@gmail.com> Cc: Samuel Thibault <samuel.thibault@ens-lyon.org> Signed-off-by: Johan Hovold <johan@kernel.org> Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-30MAINTAINERS: Add Andy and Darren as arch/x86/platform/ reviewersBorislav Petkov1-0/+9
... so that they can get CCed on platform patches. Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Darren Hart <dvhart@infradead.org> Cc: Andy Shevchenko <andy@infradead.org> Cc: x86@kernel.org Link: https://lkml.kernel.org/r/20190128113619.19025-1-bp@alien8.de
2019-01-30serial: sh-sci: Do not free irqs that have already been freedChris Brandt1-1/+8
Since IRQs might be muxed on some parts, we need to pay attention when we are freeing them. Otherwise we get the ugly WARNING "Trying to free already-free IRQ 20". Fixes: 628c534ae735 ("serial: sh-sci: Improve support for separate TEI and DRI interrupts") Cc: stable <stable@vger.kernel.org> Signed-off-by: Chris Brandt <chris.brandt@renesas.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-30serial: 8250_pci: Make PCI class test non fatalAndy Shevchenko1-4/+5
As has been reported the National Instruments serial cards have broken PCI class. The commit 7d8905d06405 ("serial: 8250_pci: Enable device after we check black list") made the PCI class check mandatory for the case when device is listed in a quirk list. Make PCI class test non fatal to allow broken card be enumerated. Fixes: 7d8905d06405 ("serial: 8250_pci: Enable device after we check black list") Cc: stable <stable@vger.kernel.org> Reported-by: Guan Yung Tseng <guan.yung.tseng@ni.com> Tested-by: Guan Yung Tseng <guan.yung.tseng@ni.com> Tested-by: KHUENY.Gerhard <Gerhard.KHUENY@bachmann.info> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-30tty: serial: 8250_mtk: Fix potential NULL pointer dereferenceGustavo A. R. Silva1-0/+3
There is a potential NULL pointer dereference in case devm_kzalloc() fails and returns NULL. Fix this by adding a NULL check on data->dma This bug was detected with the help of Coccinelle. Fixes: 85b5c1dd0456 ("serial: 8250-mtk: add uart DMA support") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-30Merge branch 'typedef-func_proto'Alexei Starovoitov2-7/+5
Yonghong Song says: ==================== The current btf implementation disallows the typedef of a func_proto type. This actually is allowed per C standard. This patch fixed btf verification to permit such types. Patch #1 fixed the kernel side and Patch #2 fixed the tools test_btf test. ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-01-30tools/bpf: fix test_btf for typedef func_proto caseYonghong Song1-6/+3
Fixed one test_btf raw test such that typedef func_proto is permitted now. Fixes: 78a2540e8945 ("tools/bpf: Add tests for BTF_KIND_FUNC_PROTO and BTF_KIND_FUNC") Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-01-30bpf: btf: allow typedef func_protoYonghong Song1-1/+2
Current implementation does not allow typedef func_proto. But it is actually allowed. -bash-4.4$ cat t.c typedef int (f) (int); f *g; -bash-4.4$ clang -O2 -g -c -target bpf t.c -Xclang -target-feature -Xclang +dwarfris -bash-4.4$ pahole -JV t.o File t.o: [1] PTR (anon) type_id=2 [2] TYPEDEF f type_id=3 [3] FUNC_PROTO (anon) return=4 args=(4 (anon)) [4] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED -bash-4.4$ This patch related btf verifier to allow such (typedef func_proto) patterns. Fixes: 2667a2626f4d ("bpf: btf: Add BTF_KIND_FUNC and BTF_KIND_FUNC_PROTO") Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-01-30Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds50-195/+605
Pull networking fixes from David Miller: 1) Need to save away the IV across tls async operations, from Dave Watson. 2) Upon successful packet processing, we should liberate the SKB with dev_consume_skb{_irq}(). From Yang Wei. 3) Only apply RX hang workaround on effected macb chips, from Harini Katakam. 4) Dummy netdev need a proper namespace assigned to them, from Josh Elsasser. 5) Some paths of nft_compat run lockless now, and thus we need to use a proper refcnt_t. From Florian Westphal. 6) Avoid deadlock in mlx5 by doing IRQ locking, from Moni Shoua. 7) netrom does not refcount sockets properly wrt. timers, fix that by using the sock timer API. From Cong Wang. 8) Fix locking of inexact inserts of xfrm policies, from Florian Westphal. 9) Missing xfrm hash generation bump, also from Florian. 10) Missing of_node_put() in hns driver, from Yonglong Liu. 11) Fix DN_IFREQ_SIZE, from Johannes Berg. 12) ip6mr notifier is invoked during traversal of wrong table, from Nir Dotan. 13) TX promisc settings not performed correctly in qed, from Manish Chopra. 14) Fix OOB access in vhost, from Jason Wang. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (52 commits) MAINTAINERS: Add entry for XDP (eXpress Data Path) net: set default network namespace in init_dummy_netdev() net: b44: replace dev_kfree_skb_xxx by dev_consume_skb_xxx for drop profiles net: caif: call dev_consume_skb_any when skb xmit done net: 8139cp: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles net: macb: Apply RXUBR workaround only to versions with errata net: ti: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles net: apple: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles net: amd8111e: replace dev_kfree_skb_irq by dev_consume_skb_irq net: alteon: replace dev_kfree_skb_irq by dev_consume_skb_irq net: tls: Fix deadlock in free_resources tx net: tls: Save iv in tls_rec for async crypto requests vhost: fix OOB in get_rx_bufs() qed: Fix stack out of bounds bug qed: Fix system crash in ll2 xmit qed: Fix VF probe failure while FLR qed: Fix LACP pdu drops for VFs qed: Fix bug in tx promiscuous mode settings net: i825xx: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles netfilter: ipt_CLUSTERIP: fix warning unused variable cn ...
2019-01-30CIFS: Do not consider -ENODATA as stat failure for readsPavel Shilovsky1-1/+1
When doing reads beyound the end of a file the server returns error STATUS_END_OF_FILE error which is mapped to -ENODATA. Currently we report it as a failure which confuses read stats. Change it to not consider -ENODATA as failure for stat purposes. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com> CC: Stable <stable@vger.kernel.org>
2019-01-30CIFS: Do not count -ENODATA as failure for query directoryPavel Shilovsky1-2/+2
Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com> CC: Stable <stable@vger.kernel.org>
2019-01-30CIFS: Fix trace command logging for SMB2 reads and writesPavel Shilovsky1-16/+30
Currently we log success once we send an async IO request to the server. Instead we need to analyse a response and then log success or failure for a particular command. Also fix argument list for read logging. Cc: <stable@vger.kernel.org> # 4.18 Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-01-30CIFS: Fix possible oops and memory leaks in async IOPavel Shilovsky1-3/+8
Allocation of a page array for non-cached IO was separated from allocation of rdata and wdata structures and this introduced memory leaks and a possible null pointer dereference. This patch fixes these problems. Cc: <stable@vger.kernel.org> Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-01-30cifs: limit amount of data we request for xattrs to CIFSMaxBufSizeRonnie Sahlberg2-3/+16
minus the various headers and blobs that will be part of the reply. or else we might trigger a session reconnect. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2019-01-30cifs: fix computation for MAX_SMB2_HDR_SIZERonnie Sahlberg1-2/+2
The size of the fixed part of the create response is 88 bytes not 56. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2019-01-30NFS: Fix up return value on fatal errors in nfs_page_async_flush()Trond Myklebust1-4/+5
Ensure that we return the fatal error value that caused us to exit nfs_page_async_flush(). Fixes: c373fff7bd25 ("NFSv4: Don't special case "launder"") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: stable@vger.kernel.org # v4.12+ Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-01-30x86/speculation: Remove redundant arch_smt_update() invocationZhenzhong Duan1-4/+1
With commit a74cfffb03b7 ("x86/speculation: Rework SMT state change"), arch_smt_update() is invoked from each individual CPU hotplug function. Therefore the extra arch_smt_update() call in the sysfs SMT control is redundant. Fixes: a74cfffb03b7 ("x86/speculation: Rework SMT state change") Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: <konrad.wilk@oracle.com> Cc: <dwmw@amazon.co.uk> Cc: <bp@suse.de> Cc: <srinivas.eeda@oracle.com> Cc: <peterz@infradead.org> Cc: <hpa@zytor.com> Link: https://lkml.kernel.org/r/e2e064f2-e8ef-42ca-bf4f-76b612964752@default
2019-01-29x86/fault: Fix sign-extend unintended sign extensionColin Ian King1-1/+1
show_ldttss() shifts desc.base2 by 24 bit, but base2 is 8 bits of a bitfield in a u16. Due to the really great idea of integer promotion in C99 base2 is promoted to an int, because that's the standard defined behaviour when all values which can be represented by base2 fit into an int. Now if bit 7 is set in desc.base2 the result of the shift left by 24 makes the resulting integer negative and the following conversion to unsigned long legitmately sign extends first causing the upper bits 32 bits to be set in the result. Fix this by casting desc.base2 to unsigned long before the shift. Detected by CoverityScan, CID#1475635 ("Unintended sign extension") [ tglx: Reworded the changelog a bit as I actually had to lookup the standard (again) to decode the original one. ] Fixes: a1a371c468f7 ("x86/fault: Decode page fault OOPSes better") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: kernel-janitors@vger.kernel.org Link: https://lkml.kernel.org/r/20181222191116.21831-1-colin.king@canonical.com
2019-01-29x86/boot/compressed/64: Set EFER.LME=1 in 32-bit trampoline before returning ↵Wei Huang2-1/+9
to long mode In some old AMD KVM implementation, guest's EFER.LME bit is cleared by KVM when the hypervsior detects that the guest sets CR0.PG to 0. This causes the guest OS to reboot when it tries to return from 32-bit trampoline code because the CPU is in incorrect state: CR4.PAE=1, CR0.PG=1, CS.L=1, but EFER.LME=0. As a precaution, set EFER.LME=1 as part of long mode activation procedure. This extra step won't cause any harm when Linux is booted on a bare-metal machine. Signed-off-by: Wei Huang <wei@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: bp@alien8.de Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/20190104054411.12489-1-wei@redhat.com
2019-01-29IB/uverbs: Fix OOPs in uverbs_user_mmap_disassociateYishai Hadas1-5/+13
The vma->vm_mm can become impossible to get before rdma_umap_close() is called, in this case we must not try to get an mm that is already undergoing process exit. In this case there is no need to wait for anything as the VMA will be destroyed by another thread soon and is already effectively 'unreachable' by userspace. BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 PGD 800000012bc50067 P4D 800000012bc50067 PUD 129db5067 PMD 0 Oops: 0000 [#1] SMP PTI CPU: 1 PID: 2050 Comm: bash Tainted: G W OE 4.20.0-rc6+ #3 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 RIP: 0010:__rb_erase_color+0xb9/0x280 Code: 84 17 01 00 00 48 3b 68 10 0f 84 15 01 00 00 48 89 58 08 48 89 de 48 89 ef 4c 89 e3 e8 90 84 22 00 e9 60 ff ff ff 48 8b 5d 10 <f6> 03 01 0f 84 9c 00 00 00 48 8b 43 10 48 85 c0 74 09 f6 00 01 0f RSP: 0018:ffffbecfc090bab8 EFLAGS: 00010246 RAX: ffff97616346cf30 RBX: 0000000000000000 RCX: 0000000000000101 RDX: 0000000000000000 RSI: ffff97623b6ca828 RDI: ffff97621ef10828 RBP: ffff97621ef10828 R08: ffff97621ef10828 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: ffff97623b6ca838 R13: ffffffffbb3fef50 R14: ffff97623b6ca828 R15: 0000000000000000 FS: 00007f7a5c31d740(0000) GS:ffff97623bb00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000011255a000 CR4: 00000000000006e0 Call Trace: unlink_file_vma+0x3b/0x50 free_pgtables+0xa1/0x110 exit_mmap+0xca/0x1a0 ? mlx5_ib_dealloc_pd+0x28/0x30 [mlx5_ib] mmput+0x54/0x140 uverbs_user_mmap_disassociate+0xcc/0x160 [ib_uverbs] uverbs_destroy_ufile_hw+0xf7/0x120 [ib_uverbs] ib_uverbs_remove_one+0xea/0x240 [ib_uverbs] ib_unregister_device+0xfb/0x200 [ib_core] mlx5_ib_remove+0x51/0xe0 [mlx5_ib] mlx5_remove_device+0xc1/0xd0 [mlx5_core] mlx5_unregister_device+0x3d/0xb0 [mlx5_core] remove_one+0x2a/0x90 [mlx5_core] pci_device_remove+0x3b/0xc0 device_release_driver_internal+0x16d/0x240 unbind_store+0xb2/0x100 kernfs_fop_write+0x102/0x180 __vfs_write+0x36/0x1a0 ? __alloc_fd+0xa9/0x170 ? set_close_on_exec+0x49/0x70 vfs_write+0xad/0x1a0 ksys_write+0x52/0xc0 do_syscall_64+0x5b/0x180 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Cc: <stable@vger.kernel.org> # 4.19 Fixes: 5f9794dc94f5 ("RDMA/ucontext: Add a core API for mmaping driver IO memory") Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-01-29debugfs: return error values, not NULLGreg Kroah-Hartman1-17/+22
When an error happens, debugfs should return an error pointer value, not NULL. This will prevent the totally theoretical error where a debugfs call fails due to lack of memory, returning NULL, and that dentry value is then passed to another debugfs call, which would end up succeeding, creating a file at the root of the debugfs tree, but would then be impossible to remove (because you can not remove the directory NULL). So, to make everyone happy, always return errors, this makes the users of debugfs much simpler (they do not have to ever check the return value), and everyone can rest easy. Reported-by: Gary R Hook <ghook@amd.com> Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reported-by: Masami Hiramatsu <mhiramat@kernel.org> Reported-by: Michal Hocko <mhocko@kernel.org> Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reported-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>