kernel/linux.git - Linux kernel stable tree (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
2019-02-14	Merge tag 'mlx5-fixes-2019-02-13' of ↵	David S. Miller	9	-16/+55
	git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== Mellanox, mlx5 fixes 2019-02-13 This series introduces some fixes to mlx5 driver. For more information please see tag log below. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-14	net/mlx5e: XDP, fix redirect resources availability check	Saeed Mahameed	4	-4/+22
	Currently mlx5 driver creates xdp redirect hw queues unconditionally on netdevice open, This is great until someone starts redirecting XDP traffic via ndo_xdp_xmit on mlx5 device and changes the device configuration at the same time, this might cause crashes, since the other device's napi is not aware of the mlx5 state change (resources un-availability). To fix this we must synchronize with other devices napi's on the system. Added a new flag under mlx5e_priv to determine XDP TX resources are available, set/clear it up when necessary and use synchronize_rcu() when the flag is turned off, so other napi's are in-sync with it, before we actually cleanup the hw resources. The flag is tested prior to committing to transmit on mlx5e_xdp_xmit, and it is sufficient to determine if it safe to transmit or not. The other two internal flags (MLX5E_STATE_OPENED and MLX5E_SQ_STATE_ENABLED) become unnecessary. Thus, they are removed from data path. Fixes: 58b99ee3e3eb ("net/mlx5e: Add support for XDP_REDIRECT in device-out side") Reported-by: Toke Høiland-Jørgensen <toke@redhat.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-14	net/mlx5: Fix a compilation warning in events.c	Tariq Toukan	1	-8/+9
	Eliminate the following compilation warning: drivers/net/ethernet/mellanox/mlx5/core/events.c: warning: 'error_str' may be used uninitialized in this function [-Wuninitialized]: => 238:3 Fixes: c2fb3db22d35 ("net/mlx5: Rework handling of port module events") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Mikhael Goikhman <migo@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-14	net/mlx5: No command allowed when command interface is not ready	Huy Nguyen	3	-1/+20
	When EEH is injected and PCI bus stalls, mlx5's pci error detect function is called to deactivate the command interface and tear down the device. The issue is that there can be a thread that already passed MLX5_DEVICE_STATE_INTERNAL_ERROR check, it will send the command and stuck in the wait_func. Solution: Add function mlx5_cmd_flush to disable command interface and clear all the pending commands. When device state is set to MLX5_DEVICE_STATE_INTERNAL_ERROR, call mlx5_cmd_flush to ensure all pending threads waiting for firmware commands completion are terminated. Fixes: c1d4d2e92ad6 ("net/mlx5: Avoid calling sleeping function by the health poll thread") Signed-off-by: Huy Nguyen <huyn@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-14	net/mlx5e: Fix NULL pointer derefernce in set channels error flow	Maria Pasechnik	1	-3/+4
	New channels are applied to the priv channels only after they are successfully opened. Then, the indirection table should be built according to the new number of channels. Currently, such build is preformed independently of whether the channels opening is successful, and is not reverted on failure. The bug is caused due to removal of rss params from channels struct and moving it to priv struct. That change cause to independency between channels and rss params. This causes a crash on a later point, when accessing rqn of a non existing channel. This patch fixes it by moving the indirection table build right before switching the priv channels to new channels struct, after the new set of channels was successfully opened. Fixes: bbeb53b8b2c9 ("net/mlx5e: Move RSS params to a dedicated struct") Signed-off-by: Maria Pasechnik <mariap@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-13	Merge tag 'kvm-arm-fixes-for-5.0' of ↵	Paolo Bonzini	22	-168/+303
	git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master KVM/ARM fixes for 5.0: - Fix the way we reset vcpus, plugging the race that could happen on VHE - Fix potentially inconsistent group setting for private interrupts - Don't generate UNDEF when LORegion feature is present - Relax the restriction on using stage2 PUD huge mapping - Turn some spinlocks into raw_spinlocks to help RT compliance
2019-02-13	KVM: nVMX: Restore a preemption timer consistency check	Sean Christopherson	1	-0/+4
	A recently added preemption timer consistency check was unintentionally dropped when the consistency checks were being reorganized to match the SDM's ordering. Fixes: 461b4ba4c7ad ("KVM: nVMX: Move the checks for VM-Execution Control Fields to a separate helper function") Cc: Krish Sadhukhan <krish.sadhukhan@oracle.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-02-13	Merge tag 'trace-v5.0-rc4' of ↵	Linus Torvalds	1	-2/+4
	git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fix from Steven Rostedt: "This fixes kprobes/uprobes dynamic processing of strings, where it processes the args but does not update the remaining length of the buffer that the string arguments will be placed in. It constantly passes in the total size of buffer used instead of passing in the remaining size of the buffer used. This could cause issues if the strings are larger than the max size of an event which could cause the strings to be written beyond what was reserved on the buffer" * tag 'trace-v5.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: probeevent: Correctly update remaining space in dynamic area
2019-02-13	netfilter: nft_compat: use-after-free when deleting targets	Pablo Neira Ayuso	1	-1/+2
	Fetch pointer to module before target object is released. Fixes: 29e3880109e3 ("netfilter: nf_tables: fix use-after-free when deleting compat expressions") Fixes: 0ca743a55991 ("netfilter: nf_tables: add compatibility layer for x_tables") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-02-13	drm/i915: Assert that VED and ISP are power gated	Ville Syrjälä	2	-0/+61
	As there are no upstream drivers for VED or ISP let's just assert that they are power gated. Otherwise they would prevent s0ix entry. For ISP this is only relevant when it is not exposed as a PCI device and instead is a subordinate of the gunit. When exposed as a PCI device it will be handled by the atomisp2_pm driver. On my VLV FFRD8 board the firmware power gates both of these by default. Let's assume that is always the case and just WARN if we ever encounter something different. Cc: Hans de Goede <hdegoede@redhat.com> Cc: Alan Cox <alan@linux.intel.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181129175504.3630-2-ville.syrjala@linux.intel.com Reviewed-by: Imre Deak <imre.deak@intel.com>
2019-02-13	drm/i915: s/PUNIT_REG_DSPFREQ/PUNIT_REG_DSPSSPM/	Ville Syrjälä	4	-17/+17
	Rename the punit display power register to match the spec. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181129175504.3630-1-ville.syrjala@linux.intel.com Reviewed-by: Imre Deak <imre.deak@intel.com>
2019-02-13	drm/amdgpu/psp11: TA firmware is optional (v3)	Alex Deucher	2	-14/+23
	Don't warn or fail if it's missing. v2: handle xgmi case more gracefully. v3: handle older kernels properly Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Tested-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-02-13	Merge branch 'nvme-5.0' of git://git.infradead.org/nvme into for-linus	Jens Axboe	1	-3/+5
	Pull single NVMe fix from Christoph * 'nvme-5.0' of git://git.infradead.org/nvme: nvme-pci: add missing unlock for reset error
2019-02-13	signal: Restore the stop PTRACE_EVENT_EXIT	Eric W. Biederman	1	-2/+5
	In the middle of do_exit() there is there is a call "ptrace_event(PTRACE_EVENT_EXIT, code);" That call places the process in TACKED_TRACED aka "(TASK_WAKEKILL \| __TASK_TRACED)" and waits for for the debugger to release the task or SIGKILL to be delivered. Skipping past dequeue_signal when we know a fatal signal has already been delivered resulted in SIGKILL remaining pending and TIF_SIGPENDING remaining set. This in turn caused the scheduler to not sleep in PTACE_EVENT_EXIT as it figured a fatal signal was pending. This also caused ptrace_freeze_traced in ptrace_check_attach to fail because it left a per thread SIGKILL pending which is what fatal_signal_pending tests for. This difference in signal state caused strace to report strace: Exit of unknown pid NNNNN ignored Therefore update the signal handling state like dequeue_signal would when removing a per thread SIGKILL, by removing SIGKILL from the per thread signal mask and clearing TIF_SIGPENDING. Acked-by: Oleg Nesterov <oleg@redhat.com> Reported-by: Oleg Nesterov <oleg@redhat.com> Reported-by: Ivan Delalande <colona@arista.com> Cc: stable@vger.kernel.org Fixes: 35634ffa1751 ("signal: Always notice exiting tasks") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2019-02-13	drm/i915: Apply rps waitboosting for dma_fence_wait_timeout()	Chris Wilson	7	-98/+44
	As time goes by, usage of generic ioctls such as drm_syncobj and sync_file are on the increase bypassing i915-specific ioctls like GEM_WAIT. Currently, we only apply waitboosting to our driver ioctls as we track the file/client and account the waitboosting to them. However, since commit 7b92c1bd0540 ("drm/i915: Avoid keeping waitboost active for signaling threads"), we no longer have been applying the client ratelimiting on waitboosts and so that information has only been used for debug tracking. Push the application of waitboosting down to the common i915_request_wait, and apply it to all foreign fence waits as well. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Eero Tamminen <eero.t.tamminen@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190213092504.25709-1-chris@chris-wilson.co.uk
2019-02-13	x86/a.out: Clear the dump structure initially	Borislav Petkov	1	-2/+4
	dump_thread32() in aout_core_dump() does not clear the user32 structure allocated on the stack as the first thing on function entry. As a result, the dump.u_comm, dump.u_ar0 and dump.signal which get assigned before the clearing, get overwritten. Rename that function to fill_dump() to make it clear what it does and call it first thing. This was caught while staring at a patch by Derek Robson <robsonde@gmail.com>. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Derek Robson <robsonde@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Matz <matz@suse.de> Cc: x86@kernel.org Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20190202005512.3144-1-robsonde@gmail.com
2019-02-13	drm/i915/icl: Add degamma and gamma lut size to gen11 caps	Uma Shankar	1	-1/+2
	Add the degamma and gamma lut sizes to gen11 capability structure. Note: Currently this doesn't account for the extended range gamma entries and this will be addressed with new segmented gamma ABI in a future patch. v2: Reorder the patch as per Maarten's suggestion. v3: Rebase v4: Updated commit message with a note as per Matt's suggestion. v5: No Change. Signed-off-by: Uma Shankar <uma.shankar@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1549893025-21837-6-git-send-email-uma.shankar@intel.com
2019-02-13	drm/i915/icl: Enable pipe output csc	Uma Shankar	3	-19/+126
	GEN11+ onwards an output csc hardware block has been added. This is after the pipe gamma block and is in addition to the legacy pipe CSC block. Primary use case for this block is to convert RGB to YUV in case sink supports YUV. This patch adds supports for the same. v2: This is added after splitting the existing ICL pipe CSC handling. As per Matt's suggestion, made this to co-exist with existing pipe CSC, wherein both can be enabled if a certain usecase arises. v3: Fixed an issue with co-existence of output csc and normal pipe csc, spotted by Matt. Put the csc mode flag enabling to color_check to align with atomic. v4: Fixed macro alignment and checkpatch complaints wrt line over 100 characters limit. Signed-off-by: Uma Shankar <uma.shankar@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1549893025-21837-5-git-send-email-uma.shankar@intel.com
2019-02-13	drm/i915/icl: Enable ICL Pipe CSC block	Uma Shankar	2	-4/+10
	Enable ICL pipe csc hardware. CSC block is enabled in CSC_MODE register instead of PLANE_COLOR_CTL. ToDO: Extend the ABI to accept 32 bit coefficient values instead of 16bit for future platforms. v2: Addressed Maarten's review comments. v3: Addressed Matt's review comments. Removed rmw patterns as suggested by Matt. v4: Addressed Matt's review comments. v5: Addressed Ville's review comments. v6: Separated pipe output csc programming from regular csc. v7: Rebase Signed-off-by: Uma Shankar <uma.shankar@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1549893025-21837-4-git-send-email-uma.shankar@intel.com
2019-02-13	drm/i915/icl: Add icl pipe degamma and gamma support	Uma Shankar	2	-7/+26
	Add support for icl pipe degamma and gamma. v2: Removed a POSTING_READ and corrected the Bit Definition as per Maarten's comments. v3: Addressed Matt's review comments. Removed rmw patterns as suggested by Matt. v4: Fixed Matt's review comments. v5: Corrected macro alignment as per Jani Nikula's comments. Addressed Ville and Matt's review comments. v6: Merged ICL degamma handling with GLK and dropped ICL degamma function as per Ville and Matt's comments. v7: updated gamma_mode state with pre csc gammma and post gamma enabling in intel_color_check to align with atomic. v8: Addressed Maarten's review comments. Signed-off-by: Uma Shankar <uma.shankar@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1549893025-21837-3-git-send-email-uma.shankar@intel.com
2019-02-13	drm/i915/glk: Fix degamma lut programming	Uma Shankar	2	-29/+35
	Fixed the glk degamma lut programming which currently was hard coding a linear lut all the time, making degamma block of glk basically a pass through. Currently degamma lut for glk is assigned as 0 in platform configuration. Updated the same to 33 as per the hardware capability. IGT tests for degamma were getting skipped due to this, spotted by Swati. ToDo: The current gamma/degamm lut ABI has just 16bit for each color component. This is not enough for GLK+, since input precision is increased to 3.16 which will need 19bit entries. v2: Added Matt's RB. v3: Changed uint32_t to u32. v4: Fixed Maarten's review comment Credits-to: Swati Sharma <swati2.sharma@intel.com> Signed-off-by: Uma Shankar <uma.shankar@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1549893025-21837-2-git-send-email-uma.shankar@intel.com
2019-02-13	mmc: meson-gx: fix interrupt name	Martin Blumenstingl	1	-1/+2
	Commit bb364890323cca ("mmc: meson-gx: Free irq in release() callback") changed the _probe code to use request_threaded_irq() instead of devm_request_threaded_irq(). Unfortunately this removes a fallback for the interrupt name: devm_request_threaded_irq() uses the device name as fallback if the given IRQ name is NULL. request_threaded_irq() has no such fallback, thus /proc/interrupts shows "(null)" instead. Explicitly pass the dev_name() so we get the IRQ name shown in /proc/interrupts again. While here, also fix the indentation of the request_threaded_irq() parameter list. Fixes: bb364890323cca ("mmc: meson-gx: Free irq in release() callback") Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2019-02-13	perf/core: Fix impossible ring-buffer sizes warning	Ingo Molnar	1	-1/+1
	The following commit: 9dff0aa95a32 ("perf/core: Don't WARN() for impossible ring-buffer sizes") results in perf recording failures with larger mmap areas: root@skl:/tmp# perf record -g -a failed to mmap with 12 (Cannot allocate memory) The root cause is that the following condition is buggy: if (order_base_2(size) >= MAX_ORDER) goto fail; The problem is that @size is in bytes and MAX_ORDER is in pages, so the right test is: if (order_base_2(size) >= PAGE_SHIFT+MAX_ORDER) goto fail; Fix it. Reported-by: "Jin, Yao" <yao.jin@linux.intel.com> Bisected-by: Borislav Petkov <bp@alien8.de> Analyzed-by: Peter Zijlstra <peterz@infradead.org> Cc: Julien Thierry <julien.thierry@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: <stable@vger.kernel.org> Fixes: 9dff0aa95a32 ("perf/core: Don't WARN() for impossible ring-buffer sizes") Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-13	scsi: qla2xxx: Fix panic from use after free in qla2x00_async_tm_cmd	Bill Kuzeja	1	-2/+2
	In qla2x00_async_tm_cmd, we reference off sp after it has been freed. This caused a panic on a system running a slub debug kernel. Since fcport is passed in anyways, just use that instead. Signed-off-by: Bill Kuzeja <william.kuzeja@stratus.com> Acked-by: Giridhar Malavali <gmalavali@marvell.com> Acked-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-13	scsi: sd: fix entropy gathering for most rotational disks	James Bottomley	1	-3/+9
	The problem is that the default for MQ is not to gather entropy, whereas the default for the legacy queue was always to gather it. The original attempt to fix entropy gathering for rotational disks under MQ added an else branch in sd_read_block_characteristics(). Unfortunately, the entire check isn't reached if the device has no characteristics VPD page. Since this page was only introduced in SBC-3 and its optional anyway, most less expensive rotational disks don't have one, meaning they all stopped gathering entropy when we made MQ the default. In a wholly unrelated change, openssl and openssh won't function until the random number generator is initialised, meaning lots of people have been seeing large delays before they could log into systems with default MQ kernels due to this lack of entropy, because it now can take tens of minutes to initialise the kernel random number generator. The fix is to set the non-rotational and add-randomness flags unconditionally early on in the disk initialization path, so they can be reset only if the device actually reports being non-rotational via the VPD page. Reported-by: Mikael Pettersson <mikpelinux@gmail.com> Fixes: 83e32a591077 ("scsi: sd: Contribute to randomness when running rotational device") Cc: stable@vger.kernel.org Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Reviewed-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Xuewei Zhang <xueweiz@google.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-13	Merge tag 'imx-drm-fixes-2019-02-12' of git://git.pengutronix.de/pza/linux ↵	Dave Airlie	4	-14/+29
	into drm-fixes drm/imx: plane, ldb, and ipu-v3 fixes - Fix CSI register offsets for i.MX51 and i.MX53. - Fix delayed page flip completion events on i.MX6QP due to unexpected behaviour of the PRE when issuing NOP buffer updates to the same buffer address. - Stop throwing errors for plane updates on disabled CRTCs when a userspace process is killed while a plane update is pending. - Add missing of_node_put cleanup in imx_ldb_bind. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Philipp Zabel <p.zabel@pengutronix.de> Link: https://patchwork.freedesktop.org/patch/msgid/1549990602.4800.11.camel@pengutronix.de
2019-02-13	csky: Fixup dead loop in show_stack	Guo Ren	1	-0/+4
	When STACKTRACE is enabled, we must pass fp as stack for unwind, otherwise random value in stack will casue a dead loop. Signed-off-by: Guo Ren <ren_guo@c-sky.com> Reported-by: Lu Baoquan <lu.baoquan@intellif.com>
2019-02-13	csky: Fixup io-range page attribute for mmap("/dev/mem")	Guo Ren	2	-0/+19
	Some user space drivers need accessing IO address and IO remap need SO(strong order) page-attribute to make IO operation correct. So we need add SO-page-attr for all non-memory address. Signed-off-by: Guo Ren <ren_guo@c-sky.com> Reported-by: Fan Xiaodong <xiaodong.fan@boyahualu.com>
2019-02-13	csky: coding convention: Use task_stack_page	Guo Ren	2	-3/+4
	Use task_stack_page instead of p->stack to get stack. Follow the coding convention style. Also for init_stack, the same with other archs. Signed-off-by: Guo Ren <ren_guo@c-sky.com>
2019-02-13	csky: Fixup wrong pt_regs size	Guo Ren	1	-1/+2
	The bug is from commit 2054f4af1957 ("csky: bugfix gdb coredump error.") We change the ELF_NGREG to ELF_NGREG - 2 to fit gdb&gcc define, but forgot modify ptrace regset. Now coredump use ELF_NRGEG to parse GPRs and ptrace use pt_regs_regset, so there are two different reg_sets for userspace. Signed-off-by: Guo Ren <ren_guo@c-sky.com>
2019-02-13	csky: Fixup _PAGE_GLOBAL bit for 610 tlb entry	Guo Ren	1	-2/+2
	C-SKY CPU 8xx's _PAGE_GLOBAL is BIT(0), but 610's _PAGE_GLOBAL is BIT(6). Use _PAGE_GLOBAL macro instead of bad magic number. Signed-off-by: Guo Ren <ren_guo@c-sky.com>
2019-02-13	Merge branch 'akpm' (patches from Andrew)	Linus Torvalds	7	-28/+21
	Merge fixes from Andrew Morton: "6 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm: proc: smaps_rollup: fix pss_locked calculation Rename include/{uapi => }/asm-generic/shmparam.h really Revert "mm: use early_pfn_to_nid in page_ext_init" mm/gup: fix gup_pmd_range() for dax Revert "mm: slowly shrink slabs with a relatively small number of objects" Revert "mm: don't reclaim inodes with many attached pages"
2019-02-13	mm: proc: smaps_rollup: fix pss_locked calculation	Sandeep Patil	1	-8/+14
	The 'pss_locked' field of smaps_rollup was being calculated incorrectly. It accumulated the current pss everytime a locked VMA was found. Fix that by adding to 'pss_locked' the same time as that of 'pss' if the vma being walked is locked. Link: http://lkml.kernel.org/r/20190203065425.14650-1-sspatil@android.com Fixes: 493b0e9d945f ("mm: add /proc/pid/smaps_rollup") Signed-off-by: Sandeep Patil <sspatil@android.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Daniel Colascione <dancol@google.com> Cc: <stable@vger.kernel.org> [4.14.x, 4.19.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-02-13	Rename include/{uapi => }/asm-generic/shmparam.h really	Masahiro Yamada	1	-0/+0
	Commit 36c0f7f0f899 ("arch: unexport asm/shmparam.h for all architectures") is different from the patch I submitted. My patch is this: https://lore.kernel.org/lkml/1546904307-11124-1-git-send-email-yamada.masahiro@socionext.com/T/#u The file renaming part: rename include/{uapi => }/asm-generic/shmparam.h (100%) was lost when it was picked up. I think it was an accident because Andrew did not say anything. Link: http://lkml.kernel.org/r/1549158277-24558-1-git-send-email-yamada.masahiro@socionext.com Fixes: 36c0f7f0f899 ("arch: unexport asm/shmparam.h for all architectures") Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-02-13	Revert "mm: use early_pfn_to_nid in page_ext_init"	Qian Cai	2	-4/+3
	This reverts commit fe53ca54270a ("mm: use early_pfn_to_nid in page_ext_init"). When booting a system with "page_owner=on", start_kernel page_ext_init invoke_init_callbacks init_section_page_ext init_page_owner init_early_allocated_pages init_zones_in_node init_pages_in_zone lookup_page_ext page_to_nid The issue here is that page_to_nid() will not work since some page flags have no node information until later in page_alloc_init_late() due to DEFERRED_STRUCT_PAGE_INIT. Hence, it could trigger an out-of-bounds access with an invalid nid. UBSAN: Undefined behaviour in ./include/linux/mm.h:1104:50 index 7 is out of range for type 'zone [5]' Also, kernel will panic since flags were poisoned earlier with, CONFIG_DEBUG_VM_PGFLAGS=y CONFIG_NODE_NOT_IN_PAGE_FLAGS=n start_kernel setup_arch pagetable_init paging_init sparse_init sparse_init_nid memblock_alloc_try_nid_raw It did not handle it well in init_pages_in_zone() which ends up calling page_to_nid(). page:ffffea0004200000 is uninitialized and poisoned raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p)) page_owner info is not active (free page?) kernel BUG at include/linux/mm.h:990! RIP: 0010:init_page_owner+0x486/0x520 This means that assumptions behind commit fe53ca54270a ("mm: use early_pfn_to_nid in page_ext_init") are incomplete. Therefore, revert the commit for now. A proper way to move the page_owner initialization to sooner is to hook into memmap initialization. Link: http://lkml.kernel.org/r/20190115202812.75820-1-cai@lca.pw Signed-off-by: Qian Cai <cai@lca.pw> Acked-by: Michal Hocko <mhocko@kernel.org> Cc: Pasha Tatashin <Pavel.Tatashin@microsoft.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Yang Shi <yang.shi@linaro.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-02-13	mm/gup: fix gup_pmd_range() for dax	Yu Zhao	1	-1/+2
	For dax pmd, pmd_trans_huge() returns false but pmd_huge() returns true on x86. So the function works as long as hugetlb is configured. However, dax doesn't depend on hugetlb. Link: http://lkml.kernel.org/r/20190111034033.601-1-yuzhao@google.com Signed-off-by: Yu Zhao <yuzhao@google.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Keith Busch <keith.busch@intel.com> Cc: "Michael S . Tsirkin" <mst@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-02-13	Revert "mm: slowly shrink slabs with a relatively small number of objects"	Dave Chinner	1	-10/+0
	This reverts commit 172b06c32b9497 ("mm: slowly shrink slabs with a relatively small number of objects"). This change changes the agressiveness of shrinker reclaim, causing small cache and low priority reclaim to greatly increase scanning pressure on small caches. As a result, light memory pressure has a disproportionate affect on small caches, and causes large caches to be reclaimed much faster than previously. As a result, it greatly perturbs the delicate balance of the VFS caches (dentry/inode vs file page cache) such that the inode/dentry caches are reclaimed much, much faster than the page cache and this drives us into several other caching imbalance related problems. As such, this is a bad change and needs to be reverted. [ Needs some massaging to retain the later seekless shrinker modifications.] Link: http://lkml.kernel.org/r/20190130041707.27750-3-david@fromorbit.com Fixes: 172b06c32b9497 ("mm: slowly shrink slabs with a relatively small number of objects") Signed-off-by: Dave Chinner <dchinner@redhat.com> Cc: Wolfgang Walter <linux@stwm.de> Cc: Roman Gushchin <guro@fb.com> Cc: Spock <dairinin@gmail.com> Cc: Rik van Riel <riel@surriel.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-02-13	Revert "mm: don't reclaim inodes with many attached pages"	Dave Chinner	1	-5/+2
	This reverts commit a76cf1a474d7d ("mm: don't reclaim inodes with many attached pages"). This change causes serious changes to page cache and inode cache behaviour and balance, resulting in major performance regressions when combining worklaods such as large file copies and kernel compiles. https://bugzilla.kernel.org/show_bug.cgi?id=202441 This change is a hack to work around the problems introduced by changing how agressive shrinkers are on small caches in commit 172b06c32b94 ("mm: slowly shrink slabs with a relatively small number of objects"). It creates more problems than it solves, wasn't adequately reviewed or tested, so it needs to be reverted. Link: http://lkml.kernel.org/r/20190130041707.27750-2-david@fromorbit.com Fixes: a76cf1a474d7d ("mm: don't reclaim inodes with many attached pages") Signed-off-by: Dave Chinner <dchinner@redhat.com> Cc: Wolfgang Walter <linux@stwm.de> Cc: Roman Gushchin <guro@fb.com> Cc: Spock <dairinin@gmail.com> Cc: Rik van Riel <riel@surriel.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-02-13	Merge tag 'hwmon-for-v5.0-rc7' of ↵	Linus Torvalds	1	-1/+2
	git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull hwmon fix from Guenter Roeck: "Fix fan detection for NCT6793D" * tag 'hwmon-for-v5.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: hwmon: (nct6775) Fix fan6 detection for NCT6793D
2019-02-13	Merge branch 'md-fixes' of https://github.com/liu-song-6/linux into for-linus	Jens Axboe	1	-10/+18
	Pull MD fix from Song * 'md-fixes' of https://github.com/liu-song-6/linux: md/raid1: don't clear bitmap bits on interrupted recovery.
2019-02-13	md/raid1: don't clear bitmap bits on interrupted recovery.	Nate Dailey	1	-10/+18
	sync_request_write no longer submits writes to a Faulty device. This has the unfortunate side effect that bitmap bits can be incorrectly cleared if a recovery is interrupted (previously, end_sync_write would have prevented this). This means the next recovery may not copy everything it should, potentially corrupting data. Add a function for doing the proper md_bitmap_end_sync, called from end_sync_write and the Faulty case in sync_request_write. backport note to 4.14: s/md_bitmap_end_sync/bitmap_end_sync Cc: stable@vger.kernel.org 4.14+ Fixes: 0c9d5b127f69 ("md/raid1: avoid reusing a resync bio after error handling.") Reviewed-by: Jack Wang <jinpu.wang@cloud.ionos.com> Tested-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Nate Dailey <nate.dailey@stratus.com> Signed-off-by: Song Liu <songliubraving@fb.com>
2019-02-13	Merge tag 'riscv-for-linus-5.0-rc7' of ↵	Linus Torvalds	3	-10/+12
	git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux Pull RISC-V fixes from Palmer Dabbelt: "This contains a pair of bug fixes that I'd like to include in 5.0: - A fix to disambiguate swap from invalid PTEs, which fixes an error when trying to unmap PROT_NONE pages. - A revert to an optimization of the size of flat binaries. This is really a workaround to prevent breaking existing boot flows, but since the change was introduced as part of the 5.0 merge window I'd like to have the fix in before 5.0 so we can avoid a regression for any proper releases. With these I hope we're out of patches for 5.0 in RISC-V land" * tag 'riscv-for-linus-5.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux: Revert "RISC-V: Make BSS section as the last section in vmlinux.lds.S" riscv: Add pte bit to distinguish swap from invalid
2019-02-12	NFS: Don't use page_file_mapping after removing the page	Benjamin Coddington	1	-5/+6
	If nfs_page_async_flush() removes the page from the mapping, then we can't use page_file_mapping() on it as nfs_updatepate() is wont to do when receiving an error. Instead, push the mapping to the stack before the page is possibly truncated. Fixes: 8fc75bed96bb ("NFS: Fix up return value on fatal errors in nfs_page_async_flush()") Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-02-12	rpc: properly check debugfs dentry before using it	Greg Kroah-Hartman	1	-1/+1
	debugfs can now report an error code if something went wrong instead of just NULL. So if the return value is to be used as a "real" dentry, it needs to be checked if it is an error before dereferencing it. This is now happening because of ff9fb72bc077 ("debugfs: return error values, not NULL"), but why debugfs files are not being created properly is an older issue, probably one that has always been there and should probably be looked at... Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Jeff Layton <jlayton@kernel.org> Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: Anna Schumaker <anna.schumaker@netapp.com> Cc: linux-nfs@vger.kernel.org Cc: netdev@vger.kernel.org Reported-by: David Howells <dhowells@redhat.com> Tested-by: David Howells <dhowells@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-02-12	xprtrdma: Make sure Send CQ is allocated on an existing compvec	Nicolas Morey-Chaisemartin	1	-1/+2
	Make sure the device has at least 2 completion vectors before allocating to compvec#1 Fixes: a4699f5647f3 (xprtrdma: Put Send CQ in IB_POLL_WORKQUEUE mode) Signed-off-by: Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-02-12	team: avoid complex list operations in team_nl_cmd_options_set()	Cong Wang	1	-22/+5
	The current opt_inst_list operations inside team_nl_cmd_options_set() is too complex to track: LIST_HEAD(opt_inst_list); nla_for_each_nested(...) { list_for_each_entry(opt_inst, &team->option_inst_list, list) { if (__team_option_inst_tmp_find(&opt_inst_list, opt_inst)) continue; list_add(&opt_inst->tmp_list, &opt_inst_list); } } team_nl_send_event_options_get(team, &opt_inst_list); as while we retrieve 'opt_inst' from team->option_inst_list, it could be added to the local 'opt_inst_list' for multiple times. The __team_option_inst_tmp_find() doesn't work, as the setter team_mode_option_set() still calls team->ops.exit() which uses ->tmp_list too in __team_options_change_check(). Simplify the list operations by moving the 'opt_inst_list' and team_nl_send_event_options_get() into the nla_for_each_nested() loop so that it can be guranteed that we won't insert a same list entry for multiple times. Therefore, __team_option_inst_tmp_find() can be removed too. Fixes: 4fb0534fb7bb ("team: avoid adding twice the same option to the event list") Fixes: 2fcdb2c9e659 ("team: allow to send multiple set events in one message") Reported-by: syzbot+4d4af685432dc0e56c91@syzkaller.appspotmail.com Reported-by: syzbot+68ee510075cf64260cc4@syzkaller.appspotmail.com Cc: Jiri Pirko <jiri@resnulli.us> Cc: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-12	Merge branch 'net_sched-some-fixes-for-cls_tcindex'	David S. Miller	1	-32/+48
	Cong Wang says: ==================== net_sched: some fixes for cls_tcindex This patchset contains 3 bug fixes for tcindex filter. Please check each patch for details. v2: fix a compile error in patch 2 drop netns refcnt in patch 1 ==================== Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
2019-02-12	net_sched: fix two more memory leaks in cls_tcindex	Cong Wang	1	-9/+7
	struct tcindex_filter_result contains two parts: struct tcf_exts and struct tcf_result. For the local variable 'cr', its exts part is never used but initialized without being released properly on success path. So just completely remove the exts part to fix this leak. For the local variable 'new_filter_result', it is never properly released if not used by 'r' on success path. Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-12	net_sched: fix a memory leak in cls_tcindex	Cong Wang	1	-16/+30
	When tcindex_destroy() destroys all the filter results in the perfect hash table, it invokes the walker to delete each of them. However, results with class==0 are skipped in either tcindex_walk() or tcindex_delete(), which causes a memory leak reported by kmemleak. This patch fixes it by skipping the walker and directly deleting these filter results so we don't miss any filter result. As a result of this change, we have to initialize exts->net properly in tcindex_alloc_perfect_hash(). For net-next, we need to consider whether we should initialize ->net in tcf_exts_init() instead, before that just directly test CONFIG_NET_CLS_ACT=y. Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-12	net_sched: fix a race condition in tcindex_destroy()	Cong Wang	1	-7/+11
	tcindex_destroy() invokes tcindex_destroy_element() via a walker to delete each filter result in its perfect hash table, and tcindex_destroy_element() calls tcindex_delete() which schedules tcf RCU works to do the final deletion work. Unfortunately this races with the RCU callback __tcindex_destroy(), which could lead to use-after-free as reported by Adrian. Fix this by migrating this RCU callback to tcf RCU work too, as that workqueue is ordered, we will not have use-after-free. Note, we don't need to hold netns refcnt because we don't call tcf_exts_destroy() here. Fixes: 27ce4f05e2ab ("net_sched: use tcf_queue_work() in tcindex filter") Reported-by: Adrian <bugs@abtelecom.ro> Cc: Ben Hutchings <ben@decadent.org.uk> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>