summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2018-12-19binder: fix use-after-free due to ksys_close() during fdget()Todd Kjos3-2/+91
44d8047f1d8 ("binder: use standard functions to allocate fds") exposed a pre-existing issue in the binder driver. fdget() is used in ksys_ioctl() as a performance optimization. One of the rules associated with fdget() is that ksys_close() must not be called between the fdget() and the fdput(). There is a case where this requirement is not met in the binder driver which results in the reference count dropping to 0 when the device is still in use. This can result in use-after-free or other issues. If userpace has passed a file-descriptor for the binder driver using a BINDER_TYPE_FDA object, then kys_close() is called on it when handling a binder_ioctl(BC_FREE_BUFFER) command. This violates the assumptions for using fdget(). The problem is fixed by deferring the close using task_work_add(). A new variant of __close_fd() was created that returns a struct file with a reference. The fput() is deferred instead of using ksys_close(). Fixes: 44d8047f1d87a ("binder: use standard functions to allocate fds") Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Todd Kjos <tkjos@google.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-11Merge tag 'extcon-next-for-4.21' of ↵Greg Kroah-Hartman4-15/+59
git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/extcon into char-misc-next Chanwoo writes: Update extcon for 4.21 Detailed description for this pull request: 1. Fix minor issue of Maxim MUIC (Micro USB IC) device driver - Avoid forcing UART path on probe for extcon-max77843/77693/14577/8997.c - Set USB path in USB device mode for extcon-max8997.c * tag 'extcon-next-for-4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/extcon: extcon: max8997: Fix lack of path setting in USB device mode extcon: max8997: Avoid forcing UART path on drive probe extcon: max14577: Avoid forcing UART path on drive probe extcon: max77693: Avoid forcing UART path on drive probe extcon: max77843: Avoid forcing UART path on drive probe
2018-12-10Merge tag 'thunderbolt-for-v4.21' of ↵Greg Kroah-Hartman10-3/+185
git://git.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt into char-misc-next Mika writes: thunderbolt: Changes for v4.21 merge window * tag 'thunderbolt-for-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt: thunderbolt: Export IOMMU based DMA protection support to userspace iommu/vt-d: Do not enable ATS for untrusted devices iommu/vt-d: Force IOMMU on for platform opt in hint PCI / ACPI: Identify untrusted PCI devices
2018-12-10Merge 4.20-rc6 into char-misc-nextGreg Kroah-Hartman294-1264/+3083
This should resolve the hv driver merge conflict. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-10Linux 4.20-rc6Linus Torvalds1-1/+1
2018-12-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds83-368/+1225
Pull networking fixes from David Miller: "A decent batch of fixes here. I'd say about half are for problems that have existed for a while, and half are for new regressions added in the 4.20 merge window. 1) Fix 10G SFP phy module detection in mvpp2, from Baruch Siach. 2) Revert bogus emac driver change, from Benjamin Herrenschmidt. 3) Handle BPF exported data structure with pointers when building 32-bit userland, from Daniel Borkmann. 4) Memory leak fix in act_police, from Davide Caratti. 5) Check RX checksum offload in RX descriptors properly in aquantia driver, from Dmitry Bogdanov. 6) SKB unlink fix in various spots, from Edward Cree. 7) ndo_dflt_fdb_dump() only works with ethernet, enforce this, from Eric Dumazet. 8) Fix FID leak in mlxsw driver, from Ido Schimmel. 9) IOTLB locking fix in vhost, from Jean-Philippe Brucker. 10) Fix SKB truesize accounting in ipv4/ipv6/netfilter frag memory limits otherwise namespace exit can hang. From Jiri Wiesner. 11) Address block parsing length fixes in x25 from Martin Schiller. 12) IRQ and ring accounting fixes in bnxt_en, from Michael Chan. 13) For tun interfaces, only iface delete works with rtnl ops, enforce this by disallowing add. From Nicolas Dichtel. 14) Use after free in liquidio, from Pan Bian. 15) Fix SKB use after passing to netif_receive_skb(), from Prashant Bhole. 16) Static key accounting and other fixes in XPS from Sabrina Dubroca. 17) Partially initialized flow key passed to ip6_route_output(), from Shmulik Ladkani. 18) Fix RTNL deadlock during reset in ibmvnic driver, from Thomas Falcon. 19) Several small TCP fixes (off-by-one on window probe abort, NULL deref in tail loss probe, SNMP mis-estimations) from Yuchung Cheng" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (93 commits) net/sched: cls_flower: Reject duplicated rules also under skip_sw bnxt_en: Fix _bnxt_get_max_rings() for 57500 chips. bnxt_en: Fix NQ/CP rings accounting on the new 57500 chips. bnxt_en: Keep track of reserved IRQs. bnxt_en: Fix CNP CoS queue regression. net/mlx4_core: Correctly set PFC param if global pause is turned off. Revert "net/ibm/emac: wrong bit is used for STA control" neighbour: Avoid writing before skb->head in neigh_hh_output() ipv6: Check available headroom in ip6_xmit() even without options tcp: lack of available data can also cause TSO defer ipv6: sr: properly initialize flowi6 prior passing to ip6_route_output mlxsw: spectrum_switchdev: Fix VLAN device deletion via ioctl mlxsw: spectrum_router: Relax GRE decap matching check mlxsw: spectrum_switchdev: Avoid leaking FID's reference count mlxsw: spectrum_nve: Remove easily triggerable warnings ipv4: ipv6: netfilter: Adjust the frag mem limit when truesize changes sctp: frag_point sanity check tcp: fix NULL ref in tail loss probe tcp: Do not underestimate rwnd_limited net: use skb_list_del_init() to remove from RX sublists ...
2018-12-10Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds3-5/+10
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Three fixes: a boot parameter re-(re-)fix, a retpoline build artifact fix and an LLVM workaround" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/vdso: Drop implicit common-page-size linker flag x86/build: Fix compiler support check for CONFIG_RETPOLINE x86/boot: Clear RSDP address in boot_params for broken loaders
2018-12-10Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds2-1/+5
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull kprobes fixes from Ingo Molnar: "Two kprobes fixes: a blacklist fix and an instruction patching related corruption fix" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: kprobes/x86: Blacklist non-attachable interrupt functions kprobes/x86: Fix instruction patching corruption when copying more than one RIP-relative instruction
2018-12-10Merge branch 'efi-urgent-for-linus' of ↵Linus Torvalds2-25/+42
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull EFI fixes from Ingo Molnar: "Two fixes: a large-system fix and an earlyprintk fix with certain resolutions" * 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/earlyprintk/efi: Fix infinite loop on some screen widths x86/efi: Allocate e820 buffer before calling efi_exit_boot_service
2018-12-09net/sched: cls_flower: Reject duplicated rules also under skip_swOr Gerlitz1-13/+10
Currently, duplicated rules are rejected only for skip_hw or "none", hence allowing users to push duplicates into HW for no reason. Use the flower tables to protect for that. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Paul Blakey <paulb@mellanox.com> Reported-by: Chris Mi <chrism@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-09Merge branch 'bnxt_en-Bug-fixes'David S. Miller3-13/+50
Michael Chan says: ==================== bnxt_en: Bug fixes. The first patch fixes a regression on CoS queue setup, introduced recently by the 57500 new chip support patches. The rest are fixes related to ring and resource accounting on the new 57500 chips. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-09bnxt_en: Fix _bnxt_get_max_rings() for 57500 chips.Michael Chan1-4/+12
The CP rings are accounted differently on the new 57500 chips. There must be enough CP rings for the sum of RX and TX rings on the new chips. The current logic may be over-estimating the RX and TX rings. The output parameter max_cp should be the maximum NQs capped by MSIX vectors available for networking in the context of 57500 chips. The existing code which uses CMPL rings capped by the MSIX vectors works most of the time but is not always correct. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-09bnxt_en: Fix NQ/CP rings accounting on the new 57500 chips.Michael Chan1-6/+23
The new 57500 chips have introduced the NQ structure in addition to the existing CP rings in all chips. We need to introduce a new bnxt_nq_rings_in_use(). On legacy chips, the 2 functions are the same and one will just call the other. On the new chips, they refer to the 2 separate ring structures. The new function is now called to determine the resource (NQ or CP rings) associated with MSIX that are in use. On 57500 chips, the RDMA driver does not use the CP rings so we don't need to do the subtraction adjustment. Fixes: 41e8d7983752 ("bnxt_en: Modify the ring reservation functions for 57500 series chips.") Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-09bnxt_en: Keep track of reserved IRQs.Michael Chan3-3/+8
The new 57500 chips use 1 NQ per MSIX vector, whereas legacy chips use 1 CP ring per MSIX vector. To better unify this, add a resv_irqs field to struct bnxt_hw_resc. On legacy chips, we initialize resv_irqs with resv_cp_rings. On new chips, we initialize it with the allocated MSIX resources. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-09bnxt_en: Fix CNP CoS queue regression.Michael Chan1-0/+7
Recent changes to support the 57500 devices have created this regression. The bnxt_hwrm_queue_qportcfg() call was moved to be called earlier before the RDMA support was determined, causing the CoS queues configuration to be set before knowing whether RDMA was supported or not. Fix it by moving it to the right place right after RDMA support is determined. Fixes: 98f04cf0f1fc ("bnxt_en: Check context memory requirements from firmware.") Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-09Merge tag 'char-misc-4.20-rc6' of ↵Linus Torvalds6-69/+166
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver fixes from Greg KH: "Here are some small driver fixes for 4.20-rc6. There is a hyperv fix that for some reaon took forever to get into a shape that could be applied to the tree properly, but resolves a much reported issue. The others are some gnss patches, one a bugfix and the two others updates to the MAINTAINERS file to properly match the gnss files in the tree. All have been in linux-next for a while with no reported issues" * tag 'char-misc-4.20-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: MAINTAINERS: exclude gnss from SIRFPRIMA2 regex matching MAINTAINERS: add gnss scm tree gnss: sirf: fix activation retry handling Drivers: hv: vmbus: Offload the handling of channels to two workqueues
2018-12-09Merge tag 'staging-4.20-rc6' of ↵Linus Torvalds3-3/+3
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging fixes from Greg KH: "Here are two staging driver bugfixes for 4.20-rc6. One is a revert of a previously incorrect patch that was merged a while ago, and the other resolves a possible buffer overrun that was found by code inspection. Both of these have been in the linux-next tree with no reported issues" * tag 'staging-4.20-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: Revert commit ef9209b642f "staging: rtl8723bs: Fix indenting errors and an off-by-one mistake in core/rtw_mlme_ext.c" staging: rtl8712: Fix possible buffer overrun
2018-12-09Merge tag 'tty-4.20-rc6' of ↵Linus Torvalds3-12/+11
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty driver fixes from Greg KH: "Here are three small tty driver fixes for 4.20-rc6 Nothing major, just some bug fixes for reported issues. Full details are in the shortlog. All of these have been in linux-next for a while with no reported issues" * tag 'tty-4.20-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: kgdboc: fix KASAN global-out-of-bounds bug in param_set_kgdboc_var() tty: serial: 8250_mtk: always resume the device in probe. tty: do not set TTY_IO_ERROR flag if console port
2018-12-09Merge tag 'usb-4.20-rc6' of ↵Linus Torvalds12-15/+70
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are some small USB fixes for 4.20-rc6 The "largest" here are some xhci fixes for reported issues. Also here is a USB core fix, some quirk additions, and a usb-serial fix which required the export of one of the tty layer's functions to prevent code duplication. The tty maintainer agreed with this change. All of these have been in linux-next with no reported issues" * tag 'usb-4.20-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: xhci: Prevent U1/U2 link pm states if exit latency is too long xhci: workaround CSS timeout on AMD SNPS 3.0 xHC USB: check usb_get_extra_descriptor for proper size USB: serial: console: fix reported terminal settings usb: quirk: add no-LPM quirk on SanDisk Ultra Flair device USB: Fix invalid-free bug in port_over_current_notify() usb: appledisplay: Add 27" Apple Cinema Display
2018-12-09Merge tag '4.20-rc5-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds3-27/+8
Pull cifs fixes from Steve French: "Three small fixes: a fix for smb3 direct i/o, a fix for CIFS DFS for stable and a minor cifs Kconfig fix" * tag '4.20-rc5-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6: CIFS: Avoid returning EBUSY to upper layer VFS cifs: Fix separator when building path from dentry cifs: In Kconfig CONFIG_CIFS_POSIX needs depends on legacy (insecure cifs)
2018-12-09Merge tag 'dax-fixes-4.20-rc6' of ↵Linus Torvalds3-25/+50
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull dax fixes from Dan Williams: "The last of the known regression fixes and fallout from the Xarray conversion of the filesystem-dax implementation. On the path to debugging why the dax memory-failure injection test started failing after the Xarray conversion a couple more fixes for the dax_lock_mapping_entry(), now called dax_lock_page(), surfaced. Those plus the bug that started the hunt are now addressed. These patches have appeared in a -next release with no issues reported. Note the touches to mm/memory-failure.c are just the conversion to the new function signature for dax_lock_page(). Summary: - Fix the Xarray conversion of fsdax to properly handle dax_lock_mapping_entry() in the presense of pmd entries - Fix inode destruction racing a new lock request" * tag 'dax-fixes-4.20-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: dax: Fix unlock mismatch with updated API dax: Don't access a freed inode dax: Check page->mapping isn't NULL
2018-12-09Merge tag 'libnvdimm-fixes-4.20-rc6' of ↵Linus Torvalds5-30/+114
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm fixes from Dan Williams: "A regression fix for the Address Range Scrub implementation, yes another one, and support for platforms that misalign persistent memory relative to the Linux memory hotplug section constraint. Longer term, support for sub-section memory hotplug would alleviate alignment waste, but until then this hack allows a 'struct page' memmap to be established for these misaligned memory regions. These have all appeared in a -next release, and thanks to Patrick for reporting and testing the alignment padding fix. Summary: - Unless and until the core mm handles memory hotplug units smaller than a section (128M), persistent memory namespaces must be padded to section alignment. The libnvdimm core already handled section collision with "System RAM", but some configurations overlap independent "Persistent Memory" ranges within a section, so additional padding injection is added for that case. - The recent reworks of the ARS (address range scrub) state machine to reduce the number of state flags inadvertantly missed a conversion of acpi_nfit_ars_rescan() call sites. Fix the regression whereby user-requested ARS results in a "short" scrub rather than a "long" scrub. - Fixup the unit tests to handle / test the 128M section alignment of mocked test resources. * tag 'libnvdimm-fixes-4.20-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: acpi/nfit: Fix user-initiated ARS to be "ARS-long" rather than "ARS-short" libnvdimm, pfn: Pad pfn namespaces relative to other regions tools/testing/nvdimm: Align test resources to 128M
2018-12-09net/mlx4_core: Correctly set PFC param if global pause is turned off.Tarick Bedeir1-2/+2
rx_ppp and tx_ppp can be set between 0 and 255, so don't clamp to 1. Fixes: 6e8814ceb7e8 ("net/mlx4_en: Fix mixed PFC and Global pause user control requests") Signed-off-by: Tarick Bedeir <tarick@google.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-09Merge branch 'fixes' of ↵Linus Torvalds3-26/+15
git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal Pull thermal SoC fixes from Eduardo Valentin: "Fixes for armada and broadcom thermal drivers" * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal: thermal: broadcom: constify thermal_zone_of_device_ops structure thermal: armada: constify thermal_zone_of_device_ops structure thermal: bcm2835: Switch to SPDX identifier thermal: armada: fix legacy resource fixup thermal: armada: fix legacy validity test sense
2018-12-08Merge tag 'asm-generic-4.20' of ↵Linus Torvalds1-0/+4
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull asm-generic fix from Arnd Bergmann: "Multiple people reported a bug I introduced in asm-generic/unistd.h in 4.20, this is the obvious bugfix to get glibc and others to correctly build again on new architectures that no longer provide the old fstatat64() family of system calls" * tag 'asm-generic-4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: asm-generic: unistd.h: fixup broken macro include.
2018-12-08Merge tag 'clk-fixes-for-linus' of ↵Linus Torvalds6-4/+47
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk fixes from Stephen Boyd: "A few clk driver fixes this time: - Introduce protected-clock DT binding to fix breakage on qcom sdm845-mtp boards where the qspi clks introduced this merge window cause the firmware on those boards to take down the system if we try to read the clk registers - Fix a couple off-by-one errors found by Dan Carpenter - Handle failure in zynq fixed factor clk driver to avoid using uninitialized data" * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: zynqmp: Off by one in zynqmp_is_valid_clock() clk: mmp: Off by one in mmp_clk_add() clk: mvebu: Off by one bugs in cp110_of_clk_get() arm64: dts: qcom: sdm845-mtp: Mark protected gcc clocks clk: qcom: Support 'protected-clocks' property dt-bindings: clk: Introduce 'protected-clocks' property clk: zynqmp: handle fixed factor param query error
2018-12-08Merge tag 'xfs-4.20-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds6-15/+11
Pull xfs fixes from Darrick Wong: "Here are hopefully the last set of fixes for 4.20. There's a fix for a longstanding statfs reporting problem with project quotas, a correction for page cache invalidation behaviors when fallocating near EOF, and a fix for a broken metadata verifier return code. Finally, the most important fix is to the pipe splicing code (aka the generic copy_file_range fallback) to avoid pointless short directio reads by only asking the filesystem for as much data as there are available pages in the pipe buffer. Our previous fix (simulated short directio reads because the number of pages didn't match the length of the read requested) caused subtle problems on overlayfs, so that part is reverted. Anyhow, this series passes fstests -g all on xfs and overlay+xfs, and has passed 17 billion fsx operations problem-free since I started testing Summary: - Fix broken project quota inode counts - Fix incorrect PAGE_MASK/PAGE_SIZE usage - Fix incorrect return value in btree verifier - Fix WARN_ON remap flags false positive - Fix splice read overflows" * tag 'xfs-4.20-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: iomap: partially revert 4721a601099 (simulated directio short read on EFAULT) splice: don't read more than available pipe space vfs: allow some remap flags to be passed to vfs_clone_file_range xfs: fix inverted return from xfs_btree_sblock_verify_crc xfs: fix PAGE_MASK usage in xfs_free_file_space fs/xfs: fix f_ffree value for statfs when project quota is set
2018-12-08Revert "mm, thp: consolidate THP gfp handling into ↵David Rientjes4-22/+51
alloc_hugepage_direct_gfpmask" This reverts commit 89c83fb539f95491be80cdd5158e6f0ce329e317. This should have been done as part of 2f0799a0ffc0 ("mm, thp: restore node-local hugepage allocations"). The movement of the thp allocation policy from alloc_pages_vma() to alloc_hugepage_direct_gfpmask() was intended to only set __GFP_THISNODE for mempolicies that are not MPOL_BIND whereas the revert could set this regardless of mempolicy. While the check for MPOL_BIND between alloc_hugepage_direct_gfpmask() and alloc_pages_vma() was racy, that has since been removed since the revert. What is left is the possibility to use __GFP_THISNODE in policy_node() when it is unexpected because the special handling for hugepages in alloc_pages_vma() was removed as part of the consolidation. Secondly, prior to 89c83fb539f9, alloc_pages_vma() implemented a somewhat different policy for hugepage allocations, which were allocated through alloc_hugepage_vma(). For hugepage allocations, if the allocating process's node is in the set of allowed nodes, allocate with __GFP_THISNODE for that node (for MPOL_PREFERRED, use that node with __GFP_THISNODE instead). This was changed for shmem_alloc_hugepage() to allow fallback to other nodes in 89c83fb539f9 as it did for new_page() in mm/mempolicy.c which is functionally different behavior and removes the requirement to only allocate hugepages locally. So this commit does a full revert of 89c83fb539f9 instead of the partial revert that was done in 2f0799a0ffc0. The result is the same thp allocation policy for 4.20 that was in 4.19. Fixes: 89c83fb539f9 ("mm, thp: consolidate THP gfp handling into alloc_hugepage_direct_gfpmask") Fixes: 2f0799a0ffc0 ("mm, thp: restore node-local hugepage allocations") Signed-off-by: David Rientjes <rientjes@google.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-12-08Revert "net/ibm/emac: wrong bit is used for STA control"Benjamin Herrenschmidt1-1/+1
This reverts commit 624ca9c33c8a853a4a589836e310d776620f4ab9. This commit is completely bogus. The STACR register has two formats, old and new, depending on the version of the IP block used. There's a pair of device-tree properties that can be used to specify the format used: has-inverted-stacr-oc has-new-stacr-staopc What this commit did was to change the bit definition used with the old parts to match the new parts. This of course breaks the driver on all the old ones. Instead, the author should have set the appropriate properties in the device-tree for the variant used on his board. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-08Merge branch 'skb-headroom-slab-out-of-bounds'David S. Miller2-26/+44
Stefano Brivio says: ==================== Fix slab out-of-bounds on insufficient headroom for IPv6 packets Patch 1/2 fixes a slab out-of-bounds occurring with short SCTP packets over IPv4 over L2TP over IPv6 on a configuration with relatively low HEADER_MAX. Patch 2/2 makes sure we avoid writing before the allocated buffer in neigh_hh_output() in case the headroom is enough for the unaligned hardware header size, but not enough for the aligned one, and that we warn if we hit this condition. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-08neighbour: Avoid writing before skb->head in neigh_hh_output()Stefano Brivio1-5/+23
While skb_push() makes the kernel panic if the skb headroom is less than the unaligned hardware header size, it will proceed normally in case we copy more than that because of alignment, and we'll silently corrupt adjacent slabs. In the case fixed by the previous patch, "ipv6: Check available headroom in ip6_xmit() even without options", we end up in neigh_hh_output() with 14 bytes headroom, 14 bytes hardware header and write 16 bytes, starting 2 bytes before the allocated buffer. Always check we're not writing before skb->head and, if the headroom is not enough, warn and drop the packet. v2: - instead of panicking with BUG_ON(), WARN_ON_ONCE() and drop the packet (Eric Dumazet) - if we avoid the panic, though, we need to explicitly check the headroom before the memcpy(), otherwise we'll have corrupted slabs on a running kernel, after we warn - use __skb_push() instead of skb_push(), as the headroom check is already implemented here explicitly (Eric Dumazet) Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-08ipv6: Check available headroom in ip6_xmit() even without optionsStefano Brivio1-21/+21
Even if we send an IPv6 packet without options, MAX_HEADER might not be enough to account for the additional headroom required by alignment of hardware headers. On a configuration without HYPERV_NET, WLAN, AX25, and with IPV6_TUNNEL, sending short SCTP packets over IPv4 over L2TP over IPv6, we start with 100 bytes of allocated headroom in sctp_packet_transmit(), end up with 54 bytes after l2tp_xmit_skb(), and 14 bytes in ip6_finish_output2(). Those would be enough to append our 14 bytes header, but we're going to align that to 16 bytes, and write 2 bytes out of the allocated slab in neigh_hh_output(). KASan says: [ 264.967848] ================================================================== [ 264.967861] BUG: KASAN: slab-out-of-bounds in ip6_finish_output2+0x1aec/0x1c70 [ 264.967866] Write of size 16 at addr 000000006af1c7fe by task netperf/6201 [ 264.967870] [ 264.967876] CPU: 0 PID: 6201 Comm: netperf Not tainted 4.20.0-rc4+ #1 [ 264.967881] Hardware name: IBM 2827 H43 400 (z/VM 6.4.0) [ 264.967887] Call Trace: [ 264.967896] ([<00000000001347d6>] show_stack+0x56/0xa0) [ 264.967903] [<00000000017e379c>] dump_stack+0x23c/0x290 [ 264.967912] [<00000000007bc594>] print_address_description+0xf4/0x290 [ 264.967919] [<00000000007bc8fc>] kasan_report+0x13c/0x240 [ 264.967927] [<000000000162f5e4>] ip6_finish_output2+0x1aec/0x1c70 [ 264.967935] [<000000000163f890>] ip6_finish_output+0x430/0x7f0 [ 264.967943] [<000000000163fe44>] ip6_output+0x1f4/0x580 [ 264.967953] [<000000000163882a>] ip6_xmit+0xfea/0x1ce8 [ 264.967963] [<00000000017396e2>] inet6_csk_xmit+0x282/0x3f8 [ 264.968033] [<000003ff805fb0ba>] l2tp_xmit_skb+0xe02/0x13e0 [l2tp_core] [ 264.968037] [<000003ff80631192>] l2tp_eth_dev_xmit+0xda/0x150 [l2tp_eth] [ 264.968041] [<0000000001220020>] dev_hard_start_xmit+0x268/0x928 [ 264.968069] [<0000000001330e8e>] sch_direct_xmit+0x7ae/0x1350 [ 264.968071] [<000000000122359c>] __dev_queue_xmit+0x2b7c/0x3478 [ 264.968075] [<00000000013d2862>] ip_finish_output2+0xce2/0x11a0 [ 264.968078] [<00000000013d9b14>] ip_finish_output+0x56c/0x8c8 [ 264.968081] [<00000000013ddd1e>] ip_output+0x226/0x4c0 [ 264.968083] [<00000000013dbd6c>] __ip_queue_xmit+0x894/0x1938 [ 264.968100] [<000003ff80bc3a5c>] sctp_packet_transmit+0x29d4/0x3648 [sctp] [ 264.968116] [<000003ff80b7bf68>] sctp_outq_flush_ctrl.constprop.5+0x8d0/0xe50 [sctp] [ 264.968131] [<000003ff80b7c716>] sctp_outq_flush+0x22e/0x7d8 [sctp] [ 264.968146] [<000003ff80b35c68>] sctp_cmd_interpreter.isra.16+0x530/0x6800 [sctp] [ 264.968161] [<000003ff80b3410a>] sctp_do_sm+0x222/0x648 [sctp] [ 264.968177] [<000003ff80bbddac>] sctp_primitive_ASSOCIATE+0xbc/0xf8 [sctp] [ 264.968192] [<000003ff80b93328>] __sctp_connect+0x830/0xc20 [sctp] [ 264.968208] [<000003ff80bb11ce>] sctp_inet_connect+0x2e6/0x378 [sctp] [ 264.968212] [<0000000001197942>] __sys_connect+0x21a/0x450 [ 264.968215] [<000000000119aff8>] sys_socketcall+0x3d0/0xb08 [ 264.968218] [<000000000184ea7a>] system_call+0x2a2/0x2c0 [...] Just like ip_finish_output2() does for IPv4, check that we have enough headroom in ip6_xmit(), and reallocate it if we don't. This issue is older than git history. Reported-by: Jianlin Shi <jishi@redhat.com> Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-08tcp: lack of available data can also cause TSO deferEric Dumazet1-11/+24
tcp_tso_should_defer() can return true in three different cases : 1) We are cwnd-limited 2) We are rwnd-limited 3) We are application limited. Neal pointed out that my recent fix went too far, since it assumed that if we were not in 1) case, we must be rwnd-limited Fix this by properly populating the is_cwnd_limited and is_rwnd_limited booleans. After this change, we can finally move the silly check for FIN flag only for the application-limited case. The same move for EOR bit will be handled in net-next, since commit 1c09f7d073b1 ("tcp: do not try to defer skbs with eor mark (MSG_EOR)") is scheduled for linux-4.21 Tested by running 200 concurrent netperf -t TCP_RR -- -r 60000,100 and checking none of them was rwnd_limited in the chrono_stat output from "ss -ti" command. Fixes: 41727549de3e ("tcp: Do not underestimate rwnd_limited") Signed-off-by: Eric Dumazet <edumazet@google.com> Suggested-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Reviewed-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-08Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds2-34/+62
Pull vhost/virtio fixes from Michael Tsirkin: "A couple of last-minute fixes" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: vhost/vsock: fix use-after-free in network stack callers virtio/s390: fix race in ccw_io_helper() virtio/s390: avoid race on vcdev->config vhost/vsock: fix reset orphans race with close timeout
2018-12-08Merge tag 'arm64-fixes' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fix from Catalin Marinas: "Avoid sending IPIs with interrupts disabled" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: hibernate: Avoid sending cross-calling with interrupts disabled
2018-12-08Merge tag 'gcc-plugins-v4.20-rc6' of ↵Linus Torvalds2-4/+6
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull gcc stackleak plugin fixes from Kees Cook: - Remove tracing for inserted stack depth marking function (Anders Roxell) - Move gcc-plugin pass location to avoid objtool warnings (Alexander Popov) * tag 'gcc-plugins-v4.20-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: stackleak: Register the 'stackleak_cleanup' pass before the '*free_cfg' pass stackleak: Mark stackleak_track_stack() as notrace
2018-12-08Merge branch 'linus' of ↵Linus Torvalds4-7/+13
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fixes from Herbert Xu: - Disable the new crypto stats interface as it's still being changed - Fix potential uses-after-free in cbc/cfb/pcbc. * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: user - Disable statistics interface crypto: do not free algorithm before using
2018-12-07Merge tag 'pci-v4.20-fixes-3' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fixes from Bjorn Helgaas: "Revert ASPM change that caused a regression" * tag 'pci-v4.20-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: Revert "PCI/ASPM: Do not initialize link state when aspm_disabled is set"
2018-12-07ipv6: sr: properly initialize flowi6 prior passing to ip6_route_outputShmulik Ladkani1-0/+1
In 'seg6_output', stack variable 'struct flowi6 fl6' was missing initialization. Fixes: 6c8702c60b88 ("ipv6: sr: add support for SRH encapsulation and injection with lwtunnels") Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07Merge tag 'for-linus-20181207' of git://git.kernel.dk/linux-blockLinus Torvalds6-55/+123
Pull block fixes from Jens Axboe: "Let's try this again... We're finally happy with the DM livelock issue, and it's also passed overnight testing and the corruption regression test. The end result is much nicer now too, which is great. Outside of that fix, there's a pull request for NVMe with two small fixes, and a regression fix for BFQ from this merge window. The BFQ fix looks bigger than it is, it's 90% comment updates" * tag 'for-linus-20181207' of git://git.kernel.dk/linux-block: blk-mq: punt failed direct issue to dispatch list nvmet-rdma: fix response use after free nvme: validate controller state before rescheduling keep alive block, bfq: fix decrement of num_active_groups
2018-12-07Merge branch 'i2c/for-current-fixed' of ↵Linus Torvalds6-30/+93
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "A set of driver bugfixes for the I2C subsystem" * 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: uniphier-f: fix violation of tLOW requirement for Fast-mode i2c: uniphier: fix violation of tLOW requirement for Fast-mode i2c: uniphier-f: fill TX-FIFO only in IRQ handler for repeated START i2c: uniphier-f: fix timeout error after reading 8 bytes i2c: scmi: Fix probe error on devices with an empty SMB0001 ACPI device node i2c: axxia: properly handle master timeout i2c: rcar: check bus state before reinitializing i2c: nvidia-gpu: limit reads also for combined messages i2c: nvidia-gpu: adhere to I2C fault codes
2018-12-07Merge tag 'dmaengine-fix-4.20-rc6' of ↵Linus Torvalds3-29/+62
git://git.infradead.org/users/vkoul/slave-dma Pull dmaengine fixes from Vinod Koul: "Another pull request for dmaengine. We got bunch of fixes early this week and all are tagged to stable. Hope this is last fix for this cycle: - Fix imx-sdma handling of channel terminations, this involves reverting two commits and implement async termination - Fix cppi dma channel deletion from pending list on stop - Fix FIFO size for dw controller in Intel Merrifield" * tag 'dmaengine-fix-4.20-rc6' of git://git.infradead.org/users/vkoul/slave-dma: dmaengine: dw: Fix FIFO size for Intel Merrifield dmaengine: cppi41: delete channel from pending list when stop channel dmaengine: imx-sdma: use GFP_NOWAIT for dma descriptor allocations dmaengine: imx-sdma: implement channel termination via worker Revert "dmaengine: imx-sdma: alloclate bd memory from dma pool" Revert "dmaengine: imx-sdma: Use GFP_NOWAIT for dma allocations"
2018-12-07x86/vdso: Drop implicit common-page-size linker flagNick Desaulniers1-2/+2
GNU linker's -z common-page-size's default value is based on the target architecture. arch/x86/entry/vdso/Makefile sets it to the architecture default, which is implicit and redundant. Drop it. Fixes: 2aae950b21e4 ("x86_64: Add vDSO for x86-64 with gettimeofday/clock_gettime/getcpu") Reported-by: Dmitry Golovin <dima@golovin.in> Reported-by: Bill Wendling <morbo@google.com> Suggested-by: Dmitry Golovin <dima@golovin.in> Suggested-by: Rui Ueyama <ruiu@google.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: Fangrui Song <maskray@google.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20181206191231.192355-1-ndesaulniers@google.com Link: https://bugs.llvm.org/show_bug.cgi?id=38774 Link: https://github.com/ClangBuiltLinux/linux/issues/31
2018-12-07arm64: hibernate: Avoid sending cross-calling with interrupts disabledWill Deacon1-1/+1
Since commit 3b8c9f1cdfc50 ("arm64: IPI each CPU after invalidating the I-cache for kernel mappings"), a call to flush_icache_range() will use an IPI to cross-call other online CPUs so that any stale instructions are flushed from their pipelines. This triggers a WARN during the hibernation resume path, where flush_icache_range() is called with interrupts disabled and is therefore prone to deadlock: | Disabling non-boot CPUs ... | CPU1: shutdown | psci: CPU1 killed. | CPU2: shutdown | psci: CPU2 killed. | CPU3: shutdown | psci: CPU3 killed. | WARNING: CPU: 0 PID: 1 at ../kernel/smp.c:416 smp_call_function_many+0xd4/0x350 | Modules linked in: | CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.20.0-rc4 #1 Since all secondary CPUs have been taken offline prior to invalidating the I-cache, there's actually no need for an IPI and we can simply call __flush_icache_range() instead. Cc: <stable@vger.kernel.org> Fixes: 3b8c9f1cdfc50 ("arm64: IPI each CPU after invalidating the I-cache for kernel mappings") Reported-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com> Tested-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com> Tested-by: James Morse <james.morse@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-12-07Merge branch 'nvme-4.20' of git://git.infradead.org/nvme into for-linusJens Axboe2-2/+11
Pull NVMe fixes from Christoph. * 'nvme-4.20' of git://git.infradead.org/nvme: nvmet-rdma: fix response use after free nvme: validate controller state before rescheduling keep alive
2018-12-07blk-mq: punt failed direct issue to dispatch listJens Axboe1-28/+5
After the direct dispatch corruption fix, we permanently disallow direct dispatch of non read/write requests. This works fine off the normal IO path, as they will be retried like any other failed direct dispatch request. But for the blk_insert_cloned_request() that only DM uses to bypass the bottom level scheduler, we always first attempt direct dispatch. For some types of requests, that's now a permanent failure, and no amount of retrying will make that succeed. This results in a livelock. Instead of making special cases for what we can direct issue, and now having to deal with DM solving the livelock while still retaining a BUSY condition feedback loop, always just add a request that has been through ->queue_rq() to the hardware queue dispatch list. These are safe to use as no merging can take place there. Additionally, if requests do have prepped data from drivers, we aren't dependent on them not sharing space in the request structure to safely add them to the IO scheduler lists. This basically reverts ffe81d45322c and is based on a patch from Ming, but with the list insert case covered as well. Fixes: ffe81d45322c ("blk-mq: fix corruption with direct issue") Cc: stable@vger.kernel.org Suggested-by: Ming Lei <ming.lei@redhat.com> Reported-by: Bart Van Assche <bvanassche@acm.org> Tested-by: Ming Lei <ming.lei@redhat.com> Acked-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-07nvmet-rdma: fix response use after freeIsrael Rukshin1-1/+2
nvmet_rdma_release_rsp() may free the response before using it at error flow. Fixes: 8407879 ("nvmet-rdma: fix possible bogus dereference under heavy load") Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-12-07nvme: validate controller state before rescheduling keep aliveJames Smart1-1/+9
Delete operations are seeing NULL pointer references in call_timer_fn. Tracking these back, the timer appears to be the keep alive timer. nvme_keep_alive_work() which is tied to the timer that is cancelled by nvme_stop_keep_alive(), simply starts the keep alive io but doesn't wait for it's completion. So nvme_stop_keep_alive() only stops a timer when it's pending. When a keep alive is in flight, there is no timer running and the nvme_stop_keep_alive() will have no affect on the keep alive io. Thus, if the io completes successfully, the keep alive timer will be rescheduled. In the failure case, delete is called, the controller state is changed, the nvme_stop_keep_alive() is called while the io is outstanding, and the delete path continues on. The keep alive happens to successfully complete before the delete paths mark it as aborted as part of the queue termination, so the timer is restarted. The delete paths then tear down the controller, and later on the timer code fires and the timer entry is now corrupt. Fix by validating the controller state before rescheduling the keep alive. Testing with the fix has confirmed the condition above was hit. Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-12-07block, bfq: fix decrement of num_active_groupsPaolo Valente3-25/+107
Since commit '2d29c9f89fcd ("block, bfq: improve asymmetric scenarios detection")', if there are process groups with I/O requests waiting for completion, then BFQ tags the scenario as 'asymmetric'. This detection is needed for preserving service guarantees (for details, see comments on the computation * of the variable asymmetric_scenario in the function bfq_better_to_idle). Unfortunately, commit '2d29c9f89fcd ("block, bfq: improve asymmetric scenarios detection")' contains an error exactly in the updating of the number of groups with I/O requests waiting for completion: if a group has more than one descendant process, then the above number of groups, which is renamed from num_active_groups to a more appropriate num_groups_with_pending_reqs by this commit, may happen to be wrongly decremented multiple times, namely every time one of the descendant processes gets all its pending I/O requests completed. A correct, complete solution should work as follows. Consider a group that is inactive, i.e., that has no descendant process with pending I/O inside BFQ queues. Then suppose that num_groups_with_pending_reqs is still accounting for this group, because the group still has some descendant process with some I/O request still in flight. num_groups_with_pending_reqs should be decremented when the in-flight request of the last descendant process is finally completed (assuming that nothing else has changed for the group in the meantime, in terms of composition of the group and active/inactive state of child groups and processes). To accomplish this, an additional pending-request counter must be added to entities, and must be updated correctly. To avoid this additional field and operations, this commit resorts to the following tradeoff between simplicity and accuracy: for an inactive group that is still counted in num_groups_with_pending_reqs, this commit decrements num_groups_with_pending_reqs when the first descendant process of the group remains with no request waiting for completion. This simplified scheme provides a fix to the unbalanced decrements introduced by 2d29c9f89fcd. Since this error was also caused by lack of comments on this non-trivial issue, this commit also adds related comments. Fixes: 2d29c9f89fcd ("block, bfq: improve asymmetric scenarios detection") Reported-by: Steven Barrett <steven@liquorix.net> Tested-by: Steven Barrett <steven@liquorix.net> Tested-by: Lucjan Lucjanov <lucjan.lucjanov@gmail.com> Reviewed-by: Federico Motta <federico@willer.it> Signed-off-by: Paolo Valente <paolo.valente@linaro.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-07Merge tag 'gnss-4.20-rc6' of ↵Greg Kroah-Hartman2-3/+5
https://git.kernel.org/pub/scm/linux/kernel/git/johan/gnss into char-misc-linus Johan writes: GNSS fixes for 4.20-rc6 Here's a fix for a broken activation retry loop in the sirf driver. Included are also two MAINTAINERS updates. All have been in linux-next with no reported issues. Signed-off-by: Johan Hovold <johan@kernel.org> * tag 'gnss-4.20-rc6' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/gnss: MAINTAINERS: exclude gnss from SIRFPRIMA2 regex matching MAINTAINERS: add gnss scm tree gnss: sirf: fix activation retry handling