summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2019-10-15Merge drm/drm-next into drm-intel-next-queuedJoonas Lahtinen691-8133/+17368
Backmerging to pull in HDR DP code: https://lists.freedesktop.org/archives/dri-devel/2019-September/236453.html Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2019-10-14drm/i915/perf: allow holding preemption on filtered ctxLionel Landwerlin1-0/+11
We would like to make use of perf in Vulkan. The Vulkan API is much lower level than OpenGL, with applications directly exposed to the concept of command buffers (pretty much equivalent to our batch buffers). In Vulkan, queries are always limited in scope to a command buffer. In OpenGL, the lack of command buffer concept meant that queries' duration could span multiple command buffers. With that restriction gone in Vulkan, we would like to simplify measuring performance just by measuring the deltas between the counter snapshots written by 2 MI_RECORD_PERF_COUNT commands, rather than the more complex scheme we currently have in the GL driver, using 2 MI_RECORD_PERF_COUNT commands and doing some post processing on the stream of OA reports, coming from the global OA buffer, to remove any unrelated deltas in between the 2 MI_RECORD_PERF_COUNT. Disabling preemption only apply to a single context with which want to query performance counters for and is considered a privileged operation, by default protected by CAP_SYS_ADMIN. It is possible to enable it for a normal user by disabling the paranoid stream setting. v2: Store preemption setting in intel_context (Chris) v3: Use priorities to avoid preemption rather than the HW mechanism v4: Just modify the port priority reporting function v5: Add nopreempt flag on gem context and always flag requests appropriately, regarless of OA reconfiguration. Link: https://gitlab.freedesktop.org/mesa/mesa/merge_requests/932 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191014201404.22468-4-chris@chris-wilson.co.uk
2019-10-14drm/i915/perf: Allow dynamic reconfiguration of the OA streamChris Wilson1-0/+13
Introduce a new perf_ioctl command to change the OA configuration of the active stream. This allows the OA stream to be reconfigured between batch buffers, giving greater flexibility in sampling. We inject a request into the OA context to reconfigure the stream asynchronously on the GPU in between and ordered with execbuffer calls. Original patch for dynamic reconfiguration by Lionel Landwerlin. Link: https://gitlab.freedesktop.org/mesa/mesa/merge_requests/932 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191014201404.22468-3-chris@chris-wilson.co.uk
2019-10-14drm/i915: add support for perf configuration queriesLionel Landwerlin1-1/+61
Listing configurations at the moment is supported only through sysfs. This might cause issues for applications wanting to list configurations from a container where sysfs isn't available. This change adds a way to query the number of configurations and their content through the i915 query uAPI. v2: Fix sparse warnings (Lionel) Add support to query configuration using uuid (Lionel) v3: Fix some inconsistency in uapi header (Lionel) Fix unlocking when not locked issue (Lionel) Add debug messages (Lionel) v4: Fix missing unlock (Dan) v5: Drop lock when copying config content to userspace (Chris) v6: Drop lock when copying config list to userspace (Chris) Fix deadlock when calling i915_perf_get_oa_config() under perf.metrics_lock (Lionel) Add i915_oa_config_get() (Chris) Link: https://gitlab.freedesktop.org/mesa/mesa/merge_requests/932 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191014201404.22468-2-chris@chris-wilson.co.uk
2019-10-14drm/i915/perf: introduce a versioning of the i915-perf uapiLionel Landwerlin1-0/+21
Reporting this version will help application figure out what level of the support the running kernel provides. v2: Add i915_perf_ioctl_version() (Chris) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191014201404.22468-1-chris@chris-wilson.co.uk
2019-10-11Merge tag 'drm-misc-next-2019-10-09-2' of ↵Dave Airlie25-191/+343
git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for 5.5: UAPI Changes: -Colorspace: Expose different prop values for DP vs. HDMI (Gwan-gyeong Mun) -fourcc: Add DRM_FORMAT_MOD_ARM_16X16_BLOCK_U_INTERLEAVED (Raymond) -not_actually: s/ENOTSUPP/EOPNOTSUPP/ in drm_edid and drm_mipi_dbi. This should not reach userspace, but adding here to specifically call that out (Daniel) -i810: Prevent underflow in dispatch ioctls (Dan) -komeda: Add ACLK sysfs attribute (Mihail) -v3d: Allow userspace to clean up after render jobs (Iago) Cross-subsystem Changes: -MAINTAINERS: -Add Alyssa & Steven as panfrost reviewers (Rob) -Add Jernej as DE2 reviewer (Maxime) -Add Chen-Yu as Allwinner maintainer (Maxime) -staging: Make some stack arrays static const (Colin) Core Changes: -ttm: Allow drivers to specify their vma manager (to use gem mgr) (Gerd) -docs: Various fixes in connector/encoder/bridge docs (Daniel, Lyude, Laurent) -connector: Allow more than 3 possible encoders for a connector (José) -dp_cec: Allow a connector to be associated with a cec device (Dariusz) -various: Fix some compile/sparse warnings (Ville) -mm: Ensure mm node removals are properly serialised (Chris) -panel: Specify the type of panel for drm_panels for later use (Laurent) -panel: Use drm_panel_init to init device and funcs (Laurent) -mst: Refactors and cleanups in anticipation of suspend/resume support (Lyude) -vram: -Add lazy unmapping for gem bo's (Thomas) -Unify and rationalize vram mm and gem vram (Thomas) -Expose vmap and vunmap for gem vram objects (Thomas) -Allow objects to be pinned at the top of vram to avoid fragmentation (Thomas) Driver Changes: -various: Include drm_bridge.h instead of relying on drm_crtc.h (Boris) -ast/mgag200: Refactor show_cursor(), move cursor to top of video mem (Thomas) -komeda: -Add error event printing (behind CONFIG) and reg dump support (Lowry) -Add suspend/resume support (Lowry) -Workaround D71 shadow registers not flushing on disable (Lowry) -meson: Add suspend/resume support (Neil) -omap: Miscellaneous refactors and improvements (Tomi/Jyri) -panfrost/shmem: Silence lockdep by using mutex_trylock (Rob) -panfrost: Miscellaneous small fixes (Rob/Steven) -sti: Fix warnings (Benjamin/Linus) -sun4i: -Add vcc-dsi regulator to sun6i_mipi_dsi (Jagan) -A few patches to figure out the DRQ/start delay calc on dsi (Jagan/Icenowy) -virtio: -Add module param to switch resource reuse workaround on/off (Gerd) -Avoid calling vmexit while holding spinlock (Gerd) -Use gem shmem helpers instead of ttm (Gerd) -Accommodate command buffer allocations too big for cma (David) Cc: Rob Herring <robh@kernel.org> Cc: Maxime Ripard <mripard@kernel.org> Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com> Cc: Gerd Hoffmann <kraxel@redhat.com> Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Cc: Lyude Paul <lyude@redhat.com> Cc: José Roberto de Souza <jose.souza@intel.com> Cc: Dariusz Marcinkiewicz <darekm@google.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Raymond Smith <raymond.smith@arm.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Colin Ian King <colin.king@canonical.com> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Mihail Atanassov <Mihail.Atanassov@arm.com> Cc: Lowry Li <Lowry.Li@arm.com> Cc: Neil Armstrong <narmstrong@baylibre.com> Cc: Jyri Sarha <jsarha@ti.com> Cc: Tomi Valkeinen <tomi.valkeinen@ti.com> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Cc: Steven Price <steven.price@arm.com> Cc: Benjamin Gaignard <benjamin.gaignard@st.com> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Jagan Teki <jagan@amarulasolutions.com> Cc: Icenowy Zheng <icenowy@aosc.io> Cc: Iago Toral Quiroga <itoral@igalia.com> Cc: David Riley <davidriley@chromium.org> Signed-off-by: Dave Airlie <airlied@redhat.com> # gpg: Signature made Thu 10 Oct 2019 01:00:47 AM AEST # gpg: using RSA key 732C002572DCAF79 # gpg: Can't check signature: public key not found # Conflicts: # drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c # drivers/gpu/drm/i915/i915_drv.c # drivers/gpu/drm/i915/i915_gem.c # drivers/gpu/drm/i915/i915_gem_gtt.c # drivers/gpu/drm/i915/i915_vma.c From: Sean Paul <sean@poorly.run> Link: https://patchwork.freedesktop.org/patch/msgid/20191009150825.GA227673@art_vandelay
2019-10-08Merge tag 'drm-intel-next-2019-10-07' of ↵Dave Airlie3-23/+43
git://anongit.freedesktop.org/drm/drm-intel into drm-next UAPI Changes: - Never allow userptr into the mappable GGTT (Chris) No existing users. Avoid anyone from even trying to spare a deadlock scenario. Cross-subsystem Changes: Core Changes: Driver Changes: - Eliminate struct_mutex use as BKL! (Chris) Only used for execbuf serialisation. - Initialize DDI TC and TBT ports (D-I) on Tigerlake (Lucas) - Fix DKL link training for 2.7GHz and 1.62GHz (Jose) - Add Tigerlake DKL PHY programming sequences (Clinton) - Add Tigerlake Thunderbolt PLL divider values (Imre) - drm/i915: Use helpers for drm_mm_node booleans (Chris) - Restrict L3 remapping sysfs interface to dwords (Chris) - Fix audio power up sequence for gen10+ display (Kai) - Skip redundant execlist resubmission (Chris) - Only unwedge if we can reset GPU first (Chris) - Initialise breadcrumb lists on the virtual engine (Chris) - Don't rely on kernel context existing during early errors (Matt A) - Update Icelake+ MG_DP_MODE programming table (Clinton) - Update DMC firmware for Icelake (Anusha) - Downgrade DP MST error after unplugging TypeC cable (Srinivasan) - Limit MST modes based on plane size too (Ville) - Polish intel_tv_mode_valid() (Ville) - Fix g4x sprite scaling stride check with GTT remapping (Ville) - Don't advertize non-exisiting crtcs (Ville) - Clean up encoder->crtc_mask setup (Ville) - Use tc_port instead of port parameter to MG registers (Jose) - Remove static variable for aux last status (Jani) - Implement a better i945gm vblank irq vs. C-states workaround (Ville) - Make the object creation interface consistent (CQ) - Rename intel_vga_msr_write() to intel_vga_reset_io_mem() (Jani, Ville) - Eliminate previous drm_dbg/drm_err usage (Jani) - Move gmbus setup down to intel_modeset_init() (Jani) - Abstract all vgaarb access to intel_vga.[ch] (Jani) - Split out i915_switcheroo.[ch] from i915_drv.c (Jani) - Use intel_gt in has_reset* (Chris) - Eliminate return value for i915_gem_init_early (Matt A) - Selftest improvements (Chris) - Update HuC firmware header version number format (Daniele) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191007134801.GA24313@jlahtine-desk.ger.corp.intel.com
2019-10-07cec: add cec_adapter to cec_notifier_cec_adap_unregister()Hans Verkuil1-2/+5
It is possible for one HDMI connector to have multiple CEC adapters. The typical real-world scenario is that where one adapter is used when the device is in standby, and one that's better/smarter when the device is powered up. The cec-notifier changes were made with that in mind, but I missed that in order to support this you need to tell cec_notifier_cec_adap_unregister() which adapter you are unregistering from the notifier. Add this additional argument. It is currently unused, but once all drivers use this, the CEC core will be adapted for these use-cases. Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Acked-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/e9fc8740-6be6-43a7-beee-ce2d7b54936e@xs4all.nl
2019-10-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds5-6/+16
Pull networking fixes from David Miller: 1) Fix ieeeu02154 atusb driver use-after-free, from Johan Hovold. 2) Need to validate TCA_CBQ_WRROPT netlink attributes, from Eric Dumazet. 3) txq null deref in mac80211, from Miaoqing Pan. 4) ionic driver needs to select NET_DEVLINK, from Arnd Bergmann. 5) Need to disable bh during nft_connlimit GC, from Pablo Neira Ayuso. 6) Avoid division by zero in taprio scheduler, from Vladimir Oltean. 7) Various xgmac fixes in stmmac driver from Jose Abreu. 8) Avoid 64-bit division in mlx5 leading to link errors on 32-bit from Michal Kubecek. 9) Fix bad VLAN check in rtl8366 DSA driver, from Linus Walleij. 10) Fix sleep while atomic in sja1105, from Vladimir Oltean. 11) Suspend/resume deadlock in stmmac, from Thierry Reding. 12) Various UDP GSO fixes from Josh Hunt. 13) Fix slab out of bounds access in tcp_zerocopy_receive(), from Eric Dumazet. 14) Fix OOPS in __ipv6_ifa_notify(), from David Ahern. 15) Memory leak in NFC's llcp_sock_bind, from Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (72 commits) selftests/net: add nettest to .gitignore net: qlogic: Fix memory leak in ql_alloc_large_buffers nfc: fix memory leak in llcp_sock_bind() sch_dsmark: fix potential NULL deref in dsmark_init() net: phy: at803x: use operating parameters from PHY-specific status net: phy: extract pause mode net: phy: extract link partner advertisement reading net: phy: fix write to mii-ctrl1000 register ipv6: Handle missing host route in __ipv6_ifa_notify net: phy: allow for reset line to be tied to a sleepy GPIO controller net: ipv4: avoid mixed n_redirects and rate_tokens usage r8152: Set macpassthru in reset_resume callback cxgb4:Fix out-of-bounds MSI-X info array access Revert "ipv6: Handle race in addrconf_dad_work" net: make sock_prot_memory_pressure() return "const char *" rxrpc: Fix rxrpc_recvmsg tracepoint qmi_wwan: add support for Cinterion CLS8 devices tcp: fix slab-out-of-bounds in tcp_zerocopy_receive() lib: textsearch: fix escapes in example code udp: only do GSO if # of segs > 1 ...
2019-10-05net: phy: extract pause modeRussell King1-0/+1
Extract the update of phylib's software pause mode state from genphy_read_status(), so that we can re-use this functionality with PHYs that have alternative ways to read the negotiation results. Tested-by: tinywrkb <tinywrkb@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-05net: phy: extract link partner advertisement readingRussell King1-0/+1
Move reading the link partner advertisement out of genphy_read_status() into its own separate function. This will allow re-use of this code by PHY drivers that are able to read the resolved status from the PHY. Tested-by: tinywrkb <tinywrkb@gmail.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-05net: phy: fix write to mii-ctrl1000 registerRussell King1-0/+9
When userspace writes to the MII_ADVERTISE register, we update phylib's advertising mask and trigger a renegotiation. However, writing to the MII_CTRL1000 register, which contains the gigabit advertisement, does neither. This can lead to phylib's copy of the advertisement becoming de-synced with the values in the PHY register set, which can result in incorrect negotiation resolution. Fixes: 5502b218e001 ("net: phy: use phy_resolve_aneg_linkmode in genphy_read_status") Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-04rxrpc: Fix rxrpc_recvmsg tracepointDavid Howells1-1/+1
Fix the rxrpc_recvmsg tracepoint to handle being called with a NULL call parameter. Fixes: a25e21f0bcd2 ("rxrpc, afs: Use debug_ids rather than pointers in traces") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-04Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-0/+2
Pull KVM fixes from Paolo Bonzini: "ARM and x86 bugfixes of all kinds. The most visible one is that migrating a nested hypervisor has always been busted on Broadwell and newer processors, and that has finally been fixed" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (22 commits) KVM: x86: omit "impossible" pmu MSRs from MSR list KVM: nVMX: Fix consistency check on injected exception error code KVM: x86: omit absent pmu MSRs from MSR list selftests: kvm: Fix libkvm build error kvm: vmx: Limit guest PMCs to those supported on the host kvm: x86, powerpc: do not allow clearing largepages debugfs entry KVM: selftests: x86: clarify what is reported on KVM_GET_MSRS failure KVM: VMX: Set VMENTER_L1D_FLUSH_NOT_REQUIRED if !X86_BUG_L1TF selftests: kvm: add test for dirty logging inside nested guests KVM: x86: fix nested guest live migration with PML KVM: x86: assign two bits to track SPTE kinds KVM: x86: Expose XSAVEERPTR to the guest kvm: x86: Enumerate support for CLZERO instruction kvm: x86: Use AMD CPUID semantics for AMD vCPUs kvm: x86: Improve emulation of CPUID leaves 0BH and 1FH KVM: X86: Fix userspace set invalid CR4 kvm: x86: Fix a spurious -E2BIG in __do_cpuid_func KVM: LAPIC: Loosen filter for adaptive tuning of lapic_timer_advance_ns KVM: arm/arm64: vgic: Use the appropriate TRACE_INCLUDE_PATH arm64: KVM: Kill hyp_alternate_select() ...
2019-10-04Merge tag 'for-linus-5.4-rc2-tag' of ↵Linus Torvalds1-24/+1
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes and cleanups from Juergen Gross: - a fix in the Xen balloon driver avoiding hitting a BUG_ON() in some cases, plus a follow-on cleanup series for that driver - a patch for introducing non-blocking EFI callbacks in Xen's EFI driver, plu a cleanup patch for Xen EFI handling merging the x86 and ARM arch specific initialization into the Xen EFI driver - a fix of the Xen xenbus driver avoiding a self-deadlock when cleaning up after a user process has died - a fix for Xen on ARM after removal of ZONE_DMA - a cleanup patch for avoiding build warnings for Xen on ARM * tag 'for-linus-5.4-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/xenbus: fix self-deadlock after killing user process xen/efi: have a common runtime setup function arm: xen: mm: use __GPF_DMA32 for arm64 xen/balloon: Clear PG_offline in balloon_retrieve() xen/balloon: Mark pages PG_offline in balloon_append() xen/balloon: Drop __balloon_append() xen/balloon: Set pages PageOffline() in balloon_add_region() ARM: xen: unexport HYPERVISOR_platform_op function xen/efi: Set nonblocking callbacks
2019-10-04Merge tag 'copy-struct-from-user-v5.4-rc2' of ↵Linus Torvalds3-0/+79
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull copy_struct_from_user() helper from Christian Brauner: "This contains the copy_struct_from_user() helper which got split out from the openat2() patchset. It is a generic interface designed to copy a struct from userspace. The helper will be especially useful for structs versioned by size of which we have quite a few. This allows for backwards compatibility, i.e. an extended struct can be passed to an older kernel, or a legacy struct can be passed to a newer kernel. For the first case (extended struct, older kernel) the new fields in an extended struct can be set to zero and the struct safely passed to an older kernel. The most obvious benefit is that this helper lets us get rid of duplicate code present in at least sched_setattr(), perf_event_open(), and clone3(). More importantly it will also help to ensure that users implementing versioning-by-size end up with the same core semantics. This point is especially crucial since we have at least one case where versioning-by-size is used but with slighly different semantics: sched_setattr(), perf_event_open(), and clone3() all do do similar checks to copy_struct_from_user() while rt_sigprocmask(2) always rejects differently-sized struct arguments. With this pull request we also switch over sched_setattr(), perf_event_open(), and clone3() to use the new helper" * tag 'copy-struct-from-user-v5.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: usercopy: Add parentheses around assignment in test_copy_struct_from_user perf_event_open: switch to copy_struct_from_user() sched_setattr: switch to copy_struct_from_user() clone3: switch to copy_struct_from_user() lib: introduce copy_struct_from_user() helper
2019-10-04Merge tag 'for-linus-20191003' of ↵Linus Torvalds1-2/+26
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull clone3/pidfd fixes from Christian Brauner: "This contains a couple of fixes: - Fix pidfd selftest compilation (Shuah Kahn) Due to a false linking instruction in the Makefile compilation for the pidfd selftests would fail on some systems. - Fix compilation for glibc on RISC-V systems (Seth Forshee) In some scenarios linux/uapi/linux/sched.h is included where __ASSEMBLY__ is defined causing a build failure because struct clone_args was not guarded by an #ifndef __ASSEMBLY__. - Add missing clone3() and struct clone_args kernel-doc (Christian Brauner) clone3() and struct clone_args were missing kernel-docs. (The goal is to use kernel-doc for any function or type where it's worth it.) For struct clone_args this also contains a comment about the fact that it's versioned by size" * tag 'for-linus-20191003' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: sched: add kernel-doc for struct clone_args fork: add kernel-doc for clone3 selftests: pidfd: Fix undefined reference to pthread_create() sched: Add __ASSEMBLY__ guards around struct clone_args
2019-10-04Merge tag 'drm-fixes-2019-10-04' of git://anongit.freedesktop.org/drm/drmLinus Torvalds1-0/+2
Pull drm fixes from Dave Airlie: "Been offline for 3 days, got back and had some fixes queued up. Nothing too major, the i915 dp-mst fix is important, and amdgpu has a bulk move speedup fix and some regressions, but nothing too insane for an rc2 pull. The intel fixes are also 2 weeks worth, they missed the boat last week. core: - writeback fixes i915: - Fix DP-MST crtc_mask - Fix dsc dpp calculations - Fix g4x sprite scaling stride check with GTT remapping - Fix concurrence on cases where requests where getting retired at same time as resubmitted to HW - Fix gen9 display resolutions by setting the right max plane width - Fix GPU hang on preemption - Mark contents as dirty on a write fault. This was breaking cursor sprite with dumb buffers. komeda: - memory leak fix tilcdc: - include fix amdgpu: - Enable bulk moves - Power metrics fixes for Navi - Fix S4 regression - Add query for tcc disabled mask - Fix several leaks in error paths - randconfig fixes - clang fixes" * tag 'drm-fixes-2019-10-04' of git://anongit.freedesktop.org/drm/drm: (21 commits) Revert "drm/i915: Fix DP-MST crtc_mask" drm/omap: fix max fclk divider for omap36xx drm/i915: Fix g4x sprite scaling stride check with GTT remapping drm/i915/dp: Fix dsc bpp calculations, v5. drm/amd/display: fix dcn21 Makefile for clang drm/amd/display: hide an unused variable drm/amdgpu: display_mode_vba_21: remove uint typedef drm/amdgpu: hide another #warning drm/amdgpu: make pmu support optional, again drm/amd/display: memory leak drm/amdgpu: fix multiple memory leaks in acp_hw_init drm/amdgpu: return tcc_disabled_mask to userspace drm/amdgpu: don't increment vram lost if we are in hibernation Revert "drm/amdgpu: disable stutter mode for renoir" drm/amd/powerplay: add sensor lock support for smu drm/amd/powerplay: change metrics update period from 1ms to 100ms drm/amdgpu: revert "disable bulk moves for now" drm/tilcdc: include linux/pinctrl/consumer.h again drm/komeda: prevent memory leak in komeda_wb_connector_add drm: Clear the fence pointer when writeback job signaled ...
2019-10-04Merge tag 'for-linus-2019-10-03' of git://git.kernel.dk/linux-blockLinus Torvalds2-1/+27
Pull block fixes from Jens Axboe: - Mandate timespec64 for the io_uring timeout ABI (Arnd) - Set of NVMe changes via Sagi: - controller removal race fix from Balbir - quirk additions from Gabriel and Jian-Hong - nvme-pci power state save fix from Mario - Add 64bit user commands (for 64bit registers) from Marta - nvme-rdma/nvme-tcp fixes from Max, Mark and Me - Minor cleanups and nits from James, Dan and John - Two s390 dasd fixes (Jan, Stefan) - Have loop change block size in DIO mode (Martijn) - paride pg header ifdef guard (Masahiro) - Two blk-mq queue scheduler tweaks, fixing an ordering issue on zoned devices and suboptimal performance on others (Ming) * tag 'for-linus-2019-10-03' of git://git.kernel.dk/linux-block: (22 commits) block: sed-opal: fix sparse warning: convert __be64 data block: sed-opal: fix sparse warning: obsolete array init. block: pg: add header include guard Revert "s390/dasd: Add discard support for ESE volumes" s390/dasd: Fix error handling during online processing io_uring: use __kernel_timespec in timeout ABI loop: change queue block size to match when using DIO blk-mq: apply normal plugging for HDD blk-mq: honor IO scheduler for multiqueue devices nvme-rdma: fix possible use-after-free in connect timeout nvme: Move ctrl sqsize to generic space nvme: Add ctrl attributes for queue_count and sqsize nvme: allow 64-bit results in passthru commands nvme: Add quirk for Kingston NVME SSD running FW E8FK11.T nvmet-tcp: remove superflous check on request sgl Added QUIRKs for ADATA XPG SX8200 Pro 512GB nvme-rdma: Fix max_hw_sectors calculation nvme: fix an error code in nvme_init_subsystem() nvme-pci: Save PCI state before putting drive into deepest state nvme-tcp: fix wrong stop condition in io_work ...
2019-10-04drm/fourcc: Add Arm 16x16 block modifierRaymond Smith1-1/+25
Add the DRM_FORMAT_MOD_ARM_16X16_BLOCK_U_INTERLEAVED modifier to denote the 16x16 block u-interleaved format used in Arm Utgard and Midgard GPUs. Changes from v1:- 1. Reserved the upper four bits (out of the 56 bits assigned to each vendor) to denote the category of Arm specific modifiers. Currently, we have two categories ie AFBC and MISC. Changes from v2:- 1. Preserved Ray's authorship 2. Cleanups/changes suggested by Brian 3. Added r-bs of Brian and Qiang Signed-off-by: Raymond Smith <raymond.smith@arm.com> Reviewed-by: Brian Starkey <brian.starkey@arm.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Signed-off-by: Ayan kumar halder <ayan.halder@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191004141222.22337-1-ayan.halder@arm.com
2019-10-04drm/mm: Convert drm_mm_node booleans to bitopsChris Wilson1-3/+4
A straightforward conversion of assignment and checking of the boolean state flags (allocated, scanned) into non-atomic bitops. The caller remains responsible for all locking around the drm_mm and its nodes. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191003210100.22250-4-chris@chris-wilson.co.uk
2019-10-03sched: add kernel-doc for struct clone_argsChristian Brauner1-2/+24
Add kernel-doc for struct clone_args for the clone3() syscall. Link: https://lore.kernel.org/r/20191001114701.24661-3-christian.brauner@ubuntu.com Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2019-10-03Merge drm/drm-next into drm-misc-nextMaxime Ripard708-8186/+17198
We haven't done any backmerge for a while due to the merge window, and it starts to become an issue for komeda. Let's bring 5.4-rc1 in. Signed-off-by: Maxime Ripard <mripard@kernel.org>
2019-10-03block: pg: add header include guardMasahiro Yamada1-1/+4
Add a header include guard just in case. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-10-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller1-4/+1
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: 1) Remove the skb_ext_del from nf_reset, and renames it to a more fitting nf_reset_ct(). Patch from Florian Westphal. 2) Fix deadlock in nft_connlimit between packet path updates and the garbage collector. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-02drm/amdgpu: return tcc_disabled_mask to userspaceMarek Olšák1-0/+2
UMDs need this for correct programming of harvested chips. Signed-off-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-02net: dsa: sja1105: Fix sleeping while atomic in .port_hwtstamp_setVladimir Oltean1-1/+3
Currently this stack trace can be seen with CONFIG_DEBUG_ATOMIC_SLEEP=y: [ 41.568348] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:909 [ 41.576757] in_atomic(): 1, irqs_disabled(): 0, pid: 208, name: ptp4l [ 41.583212] INFO: lockdep is turned off. [ 41.587123] CPU: 1 PID: 208 Comm: ptp4l Not tainted 5.3.0-rc6-01445-ge950f2d4bc7f-dirty #1827 [ 41.599873] [<c0313d7c>] (unwind_backtrace) from [<c030e13c>] (show_stack+0x10/0x14) [ 41.607584] [<c030e13c>] (show_stack) from [<c1212d50>] (dump_stack+0xd4/0x100) [ 41.614863] [<c1212d50>] (dump_stack) from [<c037dfc8>] (___might_sleep+0x1c8/0x2b4) [ 41.622574] [<c037dfc8>] (___might_sleep) from [<c122ea90>] (__mutex_lock+0x48/0xab8) [ 41.630368] [<c122ea90>] (__mutex_lock) from [<c122f51c>] (mutex_lock_nested+0x1c/0x24) [ 41.638340] [<c122f51c>] (mutex_lock_nested) from [<c0c6fe08>] (sja1105_static_config_reload+0x30/0x27c) [ 41.647779] [<c0c6fe08>] (sja1105_static_config_reload) from [<c0c7015c>] (sja1105_hwtstamp_set+0x108/0x1cc) [ 41.657562] [<c0c7015c>] (sja1105_hwtstamp_set) from [<c0feb650>] (dev_ifsioc+0x18c/0x330) [ 41.665788] [<c0feb650>] (dev_ifsioc) from [<c0febbd8>] (dev_ioctl+0x320/0x6e8) [ 41.673064] [<c0febbd8>] (dev_ioctl) from [<c0f8b1f4>] (sock_ioctl+0x334/0x5e8) [ 41.680340] [<c0f8b1f4>] (sock_ioctl) from [<c05404a8>] (do_vfs_ioctl+0xb0/0xa10) [ 41.687789] [<c05404a8>] (do_vfs_ioctl) from [<c0540e3c>] (ksys_ioctl+0x34/0x58) [ 41.695151] [<c0540e3c>] (ksys_ioctl) from [<c0301000>] (ret_fast_syscall+0x0/0x28) [ 41.702768] Exception stack(0xe8495fa8 to 0xe8495ff0) [ 41.707796] 5fa0: beff4a8c 00000001 00000011 000089b0 beff4a8c beff4a80 [ 41.715933] 5fc0: beff4a8c 00000001 0000000c 00000036 b6fa98c8 004e19c1 00000001 00000000 [ 41.724069] 5fe0: 004dcedc beff4a6c 004c0738 b6e7af4c [ 41.729860] BUG: scheduling while atomic: ptp4l/208/0x00000002 [ 41.735682] INFO: lockdep is turned off. Enabling RX timestamping will logically disturb the fastpath (processing of meta frames). Replace bool hwts_rx_en with a bit that is checked atomically from the fastpath and temporarily unset from the sleepable context during a change of the RX timestamping process (a destructive operation anyways, requires switch reset). If found unset, the fastpath (net/dsa/tag_sja1105.c) will just drop any received meta frame and not take the meta_lock at all. Fixes: a602afd200f5 ("net: dsa: sja1105: Expose PTP timestamping ioctls to userspace") Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-02xen/efi: have a common runtime setup functionJuergen Gross1-24/+1
Today the EFI runtime functions are setup in architecture specific code (x86 and arm), with the functions themselves living in drivers/xen as they are not architecture dependent. As the setup is exactly the same for arm and x86 move the setup to drivers/xen, too. This at once removes the need to make the single functions global visible. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> [boris: "Dropped EXPORT_SYMBOL_GPL(xen_efi_runtime_setup)"] Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2019-10-02drm/print: add drm_debug_enabled()Jani Nikula1-0/+5
Add helper to check if a drm debug category is enabled. Convert drm core to use it. No functional changes. v2: Move unlikely() to drm_debug_enabled() (Eric) v3: Keep unlikely() when combined with other conditions (Eric) Cc: Eric Engestrom <eric@engestrom.ch> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191001140614.26909-1-jani.nikula@intel.com
2019-10-02drm/print: move drm_debug variable to drm_print.[ch]Jani Nikula2-2/+2
Move drm_debug variable declaration and definition to where they are relevant and needed. No functional changes. Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/71a566c68883b6e6c61414cd9f7c36c84015edb1.1569329774.git.jani.nikula@intel.com
2019-10-01netfilter: drop bridge nf reset from nf_resetFlorian Westphal1-4/+1
commit 174e23810cd31 ("sk_buff: drop all skb extensions on free and skb scrubbing") made napi recycle always drop skb extensions. The additional skb_ext_del() that is performed via nf_reset on napi skb recycle is not needed anymore. Most nf_reset() calls in the stack are there so queued skb won't block 'rmmod nf_conntrack' indefinitely. This removes the skb_ext_del from nf_reset, and renames it to a more fitting nf_reset_ct(). In a few selected places, add a call to skb_ext_reset to make sure that no active extensions remain. I am submitting this for "net", because we're still early in the release cycle. The patch applies to net-next too, but I think the rename causes needless divergence between those trees. Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-10-01drm/rect: Add drm_rect_init()Ville Syrjälä1-0/+17
Add a helper to initialize a rectangle from x/y/w/h information. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190930134214.24702-2-ville.syrjala@linux.intel.com Reviewed-by: Jani Nikula <jani.nikula@intel.com>
2019-10-01drm/rect: Add drm_rect_translate_to()Ville Syrjälä1-0/+14
Add a helper to translate a rectangle to an absolute position. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190930134214.24702-1-ville.syrjala@linux.intel.com Reviewed-by: Jani Nikula <jani.nikula@intel.com>
2019-10-01clone3: switch to copy_struct_from_user()Aleksa Sarai1-0/+2
Switch clone3() syscall from it's own copying struct clone_args from userspace to the new dedicated copy_struct_from_user() helper. The change is very straightforward, and helps unify the syscall interface for struct-from-userspace syscalls. Additionally, explicitly define CLONE_ARGS_SIZE_VER0 to match the other users of the struct-extension pattern. Signed-off-by: Aleksa Sarai <cyphar@cyphar.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Christian Brauner <christian.brauner@ubuntu.com> [christian.brauner@ubuntu.com: improve commit message] Link: https://lore.kernel.org/r/20191001011055.19283-3-cyphar@cyphar.com Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2019-10-01lib: introduce copy_struct_from_user() helperAleksa Sarai2-0/+77
A common pattern for syscall extensions is increasing the size of a struct passed from userspace, such that the zero-value of the new fields result in the old kernel behaviour (allowing for a mix of userspace and kernel vintages to operate on one another in most cases). While this interface exists for communication in both directions, only one interface is straightforward to have reasonable semantics for (userspace passing a struct to the kernel). For kernel returns to userspace, what the correct semantics are (whether there should be an error if userspace is unaware of a new extension) is very syscall-dependent and thus probably cannot be unified between syscalls (a good example of this problem is [1]). Previously there was no common lib/ function that implemented the necessary extension-checking semantics (and different syscalls implemented them slightly differently or incompletely[2]). Future patches replace common uses of this pattern to make use of copy_struct_from_user(). Some in-kernel selftests that insure that the handling of alignment and various byte patterns are all handled identically to memchr_inv() usage. [1]: commit 1251201c0d34 ("sched/core: Fix uclamp ABI bug, clean up and robustify sched_read_attr() ABI logic and code") [2]: For instance {sched_setattr,perf_event_open,clone3}(2) all do do similar checks to copy_struct_from_user() while rt_sigprocmask(2) always rejects differently-sized struct arguments. Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Aleksa Sarai <cyphar@cyphar.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Christian Brauner <christian.brauner@ubuntu.com> Link: https://lore.kernel.org/r/20191001011055.19283-2-cyphar@cyphar.com Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2019-09-30sched: Add __ASSEMBLY__ guards around struct clone_argsSeth Forshee1-0/+2
The addition of struct clone_args to uapi/linux/sched.h is not protected by __ASSEMBLY__ guards, causing a failure to build from source for glibc on RISC-V. Add the guards to fix this. Fixes: 7f192e3cd316 ("fork: add clone3") Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Cc: <stable@vger.kernel.org> Acked-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20190917071853.12385-1-seth.forshee@canonical.com Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2019-09-30kvm: x86, powerpc: do not allow clearing largepages debugfs entryPaolo Bonzini1-0/+2
The largepages debugfs entry is incremented/decremented as shadow pages are created or destroyed. Clearing it will result in an underflow, which is harmless to KVM but ugly (and could be misinterpreted by tools that use debugfs information), so make this particular statistic read-only. Cc: kvm-ppc@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-30Merge tag 'trace-v5.4-3' of ↵Linus Torvalds1-3/+4
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: "A few more tracing fixes: - Fix a buffer overflow by checking nr_args correctly in probes - Fix a warning that is reported by clang - Fix a possible memory leak in error path of filter processing - Fix the selftest that checks for failures, but wasn't failing - Minor clean up on call site output of a memory trace event" * tag 'trace-v5.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: selftests/ftrace: Fix same probe error test mm, tracing: Print symbol name for call_site in trace events tracing: Have error path in predicate_parse() free its allocated memory tracing: Fix clang -Wint-in-bool-context warnings in IF_ASSIGN macro tracing/probe: Fix to check the difference of nr_args before adding probe
2019-09-29Merge tag 'libnvdimm-fixes-5.4-rc1' of ↵Linus Torvalds2-1/+7
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm More libnvdimm updates from Dan Williams: - Complete the reworks to interoperate with powerpc dynamic huge page sizes - Fix a crash due to missed accounting for the powerpc 'struct page'-memmap mapping granularity - Fix badblock initialization for volatile (DRAM emulated) pmem ranges - Stop triggering request_key() notifications to userspace when NVDIMM-security is disabled / not present - Miscellaneous small fixups * tag 'libnvdimm-fixes-5.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: libnvdimm/region: Enable MAP_SYNC for volatile regions libnvdimm: prevent nvdimm from requesting key when security is disabled libnvdimm/region: Initialize bad block for volatile namespaces libnvdimm/nfit_test: Fix acpi_handle redefinition libnvdimm/altmap: Track namespace boundaries in altmap libnvdimm: Fix endian conversion issues  libnvdimm/dax: Pick the right alignment default when creating dax devices powerpc/book3s64: Export has_transparent_hugepage() related functions.
2019-09-29Merge branch 'linus' of ↵Linus Torvalds1-29/+0
git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal Pull thermal SoC updates from Eduardo Valentin: "This is a really small pull in the midst of a lot of pending patches. We are in the middle of restructuring how we are maintaining the thermal subsystem, as per discussion in our last LPC. For now, I am sending just some changes that were pending in my tree. Looking forward to get a more streamlined process in the next merge window" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal: thermal: db8500: Rewrite to be a pure OF sensor thermal: db8500: Use dev helper variable thermal: db8500: Finalize device tree conversion thermal: thermal_mmio: remove some dead code
2019-09-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds11-34/+65
Pull networking fixes from David Miller: 1) Sanity check URB networking device parameters to avoid divide by zero, from Oliver Neukum. 2) Disable global multicast filter in NCSI, otherwise LLDP and IPV6 don't work properly. Longer term this needs a better fix tho. From Vijay Khemka. 3) Small fixes to selftests (use ping when ping6 is not present, etc.) from David Ahern. 4) Bring back rt_uses_gateway member of struct rtable, it's semantics were not well understood and trying to remove it broke things. From David Ahern. 5) Move usbnet snaity checking, ignore endpoints with invalid wMaxPacketSize. From Bjørn Mork. 6) Missing Kconfig deps for sja1105 driver, from Mao Wenan. 7) Various small fixes to the mlx5 DR steering code, from Alaa Hleihel, Alex Vesker, and Yevgeny Kliteynik 8) Missing CAP_NET_RAW checks in various places, from Ori Nimron. 9) Fix crash when removing sch_cbs entry while offloading is enabled, from Vinicius Costa Gomes. 10) Signedness bug fixes, generally in looking at the result given by of_get_phy_mode() and friends. From Dan Crapenter. 11) Disable preemption around BPF_PROG_RUN() calls, from Eric Dumazet. 12) Don't create VRF ipv6 rules if ipv6 is disabled, from David Ahern. 13) Fix quantization code in tcp_bbr, from Kevin Yang. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (127 commits) net: tap: clean up an indentation issue nfp: abm: fix memory leak in nfp_abm_u32_knode_replace tcp: better handle TCP_USER_TIMEOUT in SYN_SENT state sk_buff: drop all skb extensions on free and skb scrubbing tcp_bbr: fix quantization code to not raise cwnd if not probing bandwidth mlxsw: spectrum_flower: Fail in case user specifies multiple mirror actions Documentation: Clarify trap's description mlxsw: spectrum: Clear VLAN filters during port initialization net: ena: clean up indentation issue NFC: st95hf: clean up indentation issue net: phy: micrel: add Asym Pause workaround for KSZ9021 net: socionext: ave: Avoid using netdev_err() before calling register_netdev() ptp: correctly disable flags on old ioctls lib: dimlib: fix help text typos net: dsa: microchip: Always set regmap stride to 1 nfp: flower: fix memory leak in nfp_flower_spawn_vnic_reprs nfp: flower: prevent memory leak in nfp_flower_spawn_phy_reprs net/sched: Set default of CONFIG_NET_TC_SKB_EXT to N vrf: Do not attempt to create IPv6 mcast rule if IPv6 is disabled net: sched: sch_sfb: don't call qdisc_put() while holding tree lock ...
2019-09-29Merge branch 'hugepage-fallbacks' (hugepatch patches from David Rientjes)Linus Torvalds2-6/+8
Merge hugepage allocation updates from David Rientjes: "We (mostly Linus, Andrea, and myself) have been discussing offlist how to implement a sane default allocation strategy for hugepages on NUMA platforms. With these reverts in place, the page allocator will happily allocate a remote hugepage immediately rather than try to make a local hugepage available. This incurs a substantial performance degradation when memory compaction would have otherwise made a local hugepage available. This series reverts those reverts and attempts to propose a more sane default allocation strategy specifically for hugepages. Andrea acknowledges this is likely to fix the swap storms that he originally reported that resulted in the patches that removed __GFP_THISNODE from hugepage allocations. The immediate goal is to return 5.3 to the behavior the kernel has implemented over the past several years so that remote hugepages are not immediately allocated when local hugepages could have been made available because the increased access latency is untenable. The next goal is to introduce a sane default allocation strategy for hugepages allocations in general regardless of the configuration of the system so that we prevent thrashing of local memory when compaction is unlikely to succeed and can prefer remote hugepages over remote native pages when the local node is low on memory." Note on timing: this reverts the hugepage VM behavior changes that got introduced fairly late in the 5.3 cycle, and that fixed a huge performance regression for certain loads that had been around since 4.18. Andrea had this note: "The regression of 4.18 was that it was taking hours to start a VM where 3.10 was only taking a few seconds, I reported all the details on lkml when it was finally tracked down in August 2018. https://lore.kernel.org/linux-mm/20180820032640.9896-2-aarcange@redhat.com/ __GFP_THISNODE in MADV_HUGEPAGE made the above enterprise vfio workload degrade like in the "current upstream" above. And it still would have been that bad as above until 5.3-rc5" where the bad behavior ends up happening as you fill up a local node, and without that change, you'd get into the nasty swap storm behavior due to compaction working overtime to make room for more memory on the nodes. As a result 5.3 got the two performance fix reverts in rc5. However, David Rientjes then noted that those performance fixes in turn regressed performance for other loads - although not quite to the same degree. He suggested reverting the reverts and instead replacing them with two small changes to how hugepage allocations are done (patch descriptions rephrased by me): - "avoid expensive reclaim when compaction may not succeed": just admit that the allocation failed when you're trying to allocate a huge-page and compaction wasn't successful. - "allow hugepage fallback to remote nodes when madvised": when that node-local huge-page allocation failed, retry without forcing the local node. but by then I judged it too late to replace the fixes for a 5.3 release. So 5.3 was released with behavior that harked back to the pre-4.18 logic. But now we're in the merge window for 5.4, and we can see if this alternate model fixes not just the horrendous swap storm behavior, but also restores the performance regression that the late reverts caused. Fingers crossed. * emailed patches from David Rientjes <rientjes@google.com>: mm, page_alloc: allow hugepage fallback to remote nodes when madvised mm, page_alloc: avoid expensive reclaim when compaction may not succeed Revert "Revert "Revert "mm, thp: consolidate THP gfp handling into alloc_hugepage_direct_gfpmask"" Revert "Revert "mm, thp: restore node-local hugepage allocations""
2019-09-29mm, tracing: Print symbol name for call_site in trace eventsChangbin Du1-3/+4
To improve the readability of raw slab trace points, print the call_site ip using '%pS'. Then we can grep events with function names. [002] .... 808.188897: kmem_cache_free: call_site=putname+0x47/0x50 ptr=00000000cef40c80 [002] .... 808.188898: kfree: call_site=security_cred_free+0x42/0x50 ptr=0000000062400820 [002] .... 808.188904: kmem_cache_free: call_site=put_cred_rcu+0x88/0xa0 ptr=0000000058d74ef8 [002] .... 808.188913: kmem_cache_alloc: call_site=prepare_creds+0x26/0x100 ptr=0000000058d74ef8 bytes_req=168 bytes_alloc=576 gfp_flags=GFP_KERNEL [002] .... 808.188917: kmalloc: call_site=security_prepare_creds+0x77/0xa0 ptr=0000000062400820 bytes_req=8 bytes_alloc=336 gfp_flags=GFP_KERNEL|__GFP_ZERO [002] .... 808.188920: kmem_cache_alloc: call_site=getname_flags+0x4f/0x1e0 ptr=00000000cef40c80 bytes_req=4096 bytes_alloc=4480 gfp_flags=GFP_KERNEL [002] .... 808.188925: kmem_cache_free: call_site=putname+0x47/0x50 ptr=00000000cef40c80 [002] .... 808.188926: kfree: call_site=security_cred_free+0x42/0x50 ptr=0000000062400820 [002] .... 808.188931: kmem_cache_free: call_site=put_cred_rcu+0x88/0xa0 ptr=0000000058d74ef8 Link: http://lkml.kernel.org/r/20190914103215.23301-1-changbin.du@gmail.com Signed-off-by: Changbin Du <changbin.du@gmail.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-09-29Revert "Revert "Revert "mm, thp: consolidate THP gfp handling into ↵David Rientjes1-4/+8
alloc_hugepage_direct_gfpmask"" This reverts commit 92717d429b38e4f9f934eed7e605cc42858f1839. Since commit a8282608c88e ("Revert "mm, thp: restore node-local hugepage allocations"") is reverted in this series, it is better to restore the previous 5.2 behavior between the thp allocation and the page allocator rather than to attempt any consolidation or cleanup for a policy that is now reverted. It's less risky during an rc cycle and subsequent patches in this series further modify the same policy that the pre-5.3 behavior implements. Consolidation and cleanup can be done subsequent to a sane default page allocation strategy, so this patch reverts a cleanup done on a strategy that is now reverted and thus is the least risky option. Signed-off-by: David Rientjes <rientjes@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Stefan Priebe - Profihost AG <s.priebe@profihost.ag> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-29Revert "Revert "mm, thp: restore node-local hugepage allocations""David Rientjes1-2/+0
This reverts commit a8282608c88e08b1782141026eab61204c1e533f. The commit references the original intended semantic for MADV_HUGEPAGE which has subsequently taken on three unique purposes: - enables or disables thp for a range of memory depending on the system's config (is thp "enabled" set to "always" or "madvise"), - determines the synchronous compaction behavior for thp allocations at fault (is thp "defrag" set to "always", "defer+madvise", or "madvise"), and - reverts a previous MADV_NOHUGEPAGE (there is no madvise mode to only clear previous hugepage advice). These are the three purposes that currently exist in 5.2 and over the past several years that userspace has been written around. Adding a NUMA locality preference adds a fourth dimension to an already conflated advice mode. Based on the semantic that MADV_HUGEPAGE has provided over the past several years, there exist workloads that use the tunable based on these principles: specifically that the allocation should attempt to defragment a local node before falling back. It is agreed that remote hugepages typically (but not always) have a better access latency than remote native pages, although on Naples this is at parity for intersocket. The revert commit that this patch reverts allows hugepage allocation to immediately allocate remotely when local memory is fragmented. This is contrary to the semantic of MADV_HUGEPAGE over the past several years: that is, memory compaction should be attempted locally before falling back. The performance degradation of remote hugepages over local hugepages on Rome, for example, is 53.5% increased access latency. For this reason, the goal is to revert back to the 5.2 and previous behavior that would attempt local defragmentation before falling back. With the patch that is reverted by this patch, we see performance degradations at the tail because the allocator happily allocates the remote hugepage rather than even attempting to make a local hugepage available. zone_reclaim_mode is not a solution to this problem since it does not only impact hugepage allocations but rather changes the memory allocation strategy for *all* page allocations. Signed-off-by: David Rientjes <rientjes@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Stefan Priebe - Profihost AG <s.priebe@profihost.ag> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-28Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds5-27/+29
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: - Apply a number of membarrier related fixes and cleanups, which fixes a use-after-free race in the membarrier code - Introduce proper RCU protection for tasks on the runqueue - to get rid of the subtle task_rcu_dereference() interface that was easy to get wrong - Misc fixes, but also an EAS speedup * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/fair: Avoid redundant EAS calculation sched/core: Remove double update_max_interval() call on CPU startup sched/core: Fix preempt_schedule() interrupt return comment sched/fair: Fix -Wunused-but-set-variable warnings sched/core: Fix migration to invalid CPU in __set_cpus_allowed_ptr() sched/membarrier: Return -ENOMEM to userspace on memory allocation failure sched/membarrier: Skip IPIs when mm->mm_users == 1 selftests, sched/membarrier: Add multi-threaded test sched/membarrier: Fix p->mm->membarrier_state racy load sched/membarrier: Call sync_core only before usermode for same mm sched/membarrier: Remove redundant check sched/membarrier: Fix private expedited registration check tasks, sched/core: RCUify the assignment of rq->curr tasks, sched/core: With a grace period after finish_task_switch(), remove unnecessary code tasks, sched/core: Ensure tasks are available for a grace period after leaving the runqueue tasks: Add a count of task RCU users sched/core: Convert vcpu_is_preempted() from macro to an inline function sched/fair: Remove unused cfs_rq_clock_task() function
2019-09-28Merge branch 'next-lockdown' of ↵Linus Torvalds6-3/+96
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull kernel lockdown mode from James Morris: "This is the latest iteration of the kernel lockdown patchset, from Matthew Garrett, David Howells and others. From the original description: This patchset introduces an optional kernel lockdown feature, intended to strengthen the boundary between UID 0 and the kernel. When enabled, various pieces of kernel functionality are restricted. Applications that rely on low-level access to either hardware or the kernel may cease working as a result - therefore this should not be enabled without appropriate evaluation beforehand. The majority of mainstream distributions have been carrying variants of this patchset for many years now, so there's value in providing a doesn't meet every distribution requirement, but gets us much closer to not requiring external patches. There are two major changes since this was last proposed for mainline: - Separating lockdown from EFI secure boot. Background discussion is covered here: https://lwn.net/Articles/751061/ - Implementation as an LSM, with a default stackable lockdown LSM module. This allows the lockdown feature to be policy-driven, rather than encoding an implicit policy within the mechanism. The new locked_down LSM hook is provided to allow LSMs to make a policy decision around whether kernel functionality that would allow tampering with or examining the runtime state of the kernel should be permitted. The included lockdown LSM provides an implementation with a simple policy intended for general purpose use. This policy provides a coarse level of granularity, controllable via the kernel command line: lockdown={integrity|confidentiality} Enable the kernel lockdown feature. If set to integrity, kernel features that allow userland to modify the running kernel are disabled. If set to confidentiality, kernel features that allow userland to extract confidential information from the kernel are also disabled. This may also be controlled via /sys/kernel/security/lockdown and overriden by kernel configuration. New or existing LSMs may implement finer-grained controls of the lockdown features. Refer to the lockdown_reason documentation in include/linux/security.h for details. The lockdown feature has had signficant design feedback and review across many subsystems. This code has been in linux-next for some weeks, with a few fixes applied along the way. Stephen Rothwell noted that commit 9d1f8be5cf42 ("bpf: Restrict bpf when kernel lockdown is in confidentiality mode") is missing a Signed-off-by from its author. Matthew responded that he is providing this under category (c) of the DCO" * 'next-lockdown' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (31 commits) kexec: Fix file verification on S390 security: constify some arrays in lockdown LSM lockdown: Print current->comm in restriction messages efi: Restrict efivar_ssdt_load when the kernel is locked down tracefs: Restrict tracefs when the kernel is locked down debugfs: Restrict debugfs when the kernel is locked down kexec: Allow kexec_file() with appropriate IMA policy when locked down lockdown: Lock down perf when in confidentiality mode bpf: Restrict bpf when kernel lockdown is in confidentiality mode lockdown: Lock down tracing and perf kprobes when in confidentiality mode lockdown: Lock down /proc/kcore x86/mmiotrace: Lock down the testmmiotrace module lockdown: Lock down module params that specify hardware parameters (eg. ioport) lockdown: Lock down TIOCSSERIAL lockdown: Prohibit PCMCIA CIS storage when the kernel is locked down acpi: Disable ACPI table override if the kernel is locked down acpi: Ignore acpi_rsdp kernel param when the kernel has been locked down ACPI: Limit access to custom_method when the kernel is locked down x86/msr: Restrict MSR access when the kernel is locked down x86: Lock down IO port access when the kernel is locked down ...
2019-09-28Merge branch 'next-integrity' of ↵Linus Torvalds4-3/+60
git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity Pull integrity updates from Mimi Zohar: "The major feature in this time is IMA support for measuring and appraising appended file signatures. In addition are a couple of bug fixes and code cleanup to use struct_size(). In addition to the PE/COFF and IMA xattr signatures, the kexec kernel image may be signed with an appended signature, using the same scripts/sign-file tool that is used to sign kernel modules. Similarly, the initramfs may contain an appended signature. This contained a lot of refactoring of the existing appended signature verification code, so that IMA could retain the existing framework of calculating the file hash once, storing it in the IMA measurement list and extending the TPM, verifying the file's integrity based on a file hash or signature (eg. xattrs), and adding an audit record containing the file hash, all based on policy. (The IMA support for appended signatures patch set was posted and reviewed 11 times.) The support for appended signature paves the way for adding other signature verification methods, such as fs-verity, based on a single system-wide policy. The file hash used for verifying the signature and the signature, itself, can be included in the IMA measurement list" * 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity: ima: ima_api: Use struct_size() in kzalloc() ima: use struct_size() in kzalloc() sefltest/ima: support appended signatures (modsig) ima: Fix use after free in ima_read_modsig() MODSIGN: make new include file self contained ima: fix freeing ongoing ahash_request ima: always return negative code for error ima: Store the measurement again when appraising a modsig ima: Define ima-modsig template ima: Collect modsig ima: Implement support for module-style appended signatures ima: Factor xattr_verify() out of ima_appraise_measurement() ima: Add modsig appraise_type option for module-style appended signatures integrity: Select CONFIG_KEYS instead of depending on it PKCS#7: Introduce pkcs7_get_digest() PKCS#7: Refactor verify_pkcs7_signature() MODSIGN: Export module signature definitions ima: initialize the "template" field with the default template
2019-09-28Merge tag 'nfsd-5.4' of git://linux-nfs.org/~bfields/linuxLinus Torvalds5-8/+53
Pull nfsd updates from Bruce Fields: "Highlights: - Add a new knfsd file cache, so that we don't have to open and close on each (NFSv2/v3) READ or WRITE. This can speed up read and write in some cases. It also replaces our readahead cache. - Prevent silent data loss on write errors, by treating write errors like server reboots for the purposes of write caching, thus forcing clients to resend their writes. - Tweak the code that allocates sessions to be more forgiving, so that NFSv4.1 mounts are less likely to hang when a server already has a lot of clients. - Eliminate an arbitrary limit on NFSv4 ACL sizes; they should now be limited only by the backend filesystem and the maximum RPC size. - Allow the server to enforce use of the correct kerberos credentials when a client reclaims state after a reboot. And some miscellaneous smaller bugfixes and cleanup" * tag 'nfsd-5.4' of git://linux-nfs.org/~bfields/linux: (34 commits) sunrpc: clean up indentation issue nfsd: fix nfs read eof detection nfsd: Make nfsd_reset_boot_verifier_locked static nfsd: degraded slot-count more gracefully as allocation nears exhaustion. nfsd: handle drc over-allocation gracefully. nfsd: add support for upcall version 2 nfsd: add a "GetVersion" upcall for nfsdcld nfsd: Reset the boot verifier on all write I/O errors nfsd: Don't garbage collect files that might contain write errors nfsd: Support the server resetting the boot verifier nfsd: nfsd_file cache entries should be per net namespace nfsd: eliminate an unnecessary acl size limit Deprecate nfsd fault injection nfsd: remove duplicated include from filecache.c nfsd: Fix the documentation for svcxdr_tmpalloc() nfsd: Fix up some unused variable warnings nfsd: close cached files prior to a REMOVE or RENAME that would replace target nfsd: rip out the raparms cache nfsd: have nfsd_test_lock use the nfsd_file cache nfsd: hook up nfs4_preprocess_stateid_op to the nfsd_file cache ...
2019-09-28Merge tag 'virtio-fs-5.4' of ↵Linus Torvalds3-1/+27
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse virtio-fs support from Miklos Szeredi: "Virtio-fs allows exporting directory trees on the host and mounting them in guest(s). This isn't actually a new filesystem, but a glue layer between the fuse filesystem and a virtio based back-end. It's similar in functionality to the existing virtio-9p solution, but significantly faster in benchmarks and has better POSIX compliance. Further permformance improvements can be achieved by sharing the page cache between host and guest, allowing for faster I/O and reduced memory use. Kata Containers have been including the out-of-tree virtio-fs (with the shared page cache patches as well) since version 1.7 as an experimental feature. They have been active in development and plan to switch from virtio-9p to virtio-fs as their default solution. There has been interest from other sources as well. The userspace infrastructure is slated to be merged into qemu once the kernel part hits mainline. This was developed by Vivek Goyal, Dave Gilbert and Stefan Hajnoczi" * tag 'virtio-fs-5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: virtio-fs: add virtiofs filesystem virtio-fs: add Documentation/filesystems/virtiofs.rst fuse: reserve values for mapping protocol