summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2019-07-25treewide: add "WITH Linux-syscall-note" to SPDX tag of uapi headersMasahiro Yamada17-17/+17
UAPI headers licensed under GPL are supposed to have exception "WITH Linux-syscall-note" so that they can be included into non-GPL user space application code. The exception note is missing in some UAPI headers. Some of them slipped in by the treewide conversion commit b24413180f56 ("License cleanup: add SPDX GPL-2.0 license identifier to files with no license"). Just run: $ git show --oneline b24413180f56 -- arch/x86/include/uapi/asm/ I believe they are not intentional, and should be fixed too. This patch was generated by the following script: git grep -l --not -e Linux-syscall-note --and -e SPDX-License-Identifier \ -- :arch/*/include/uapi/asm/*.h :include/uapi/ :^*/Kbuild | while read file do sed -i -e '/[[:space:]]OR[[:space:]]/s/\(GPL-[^[:space:]]*\)/(\1 WITH Linux-syscall-note)/g' \ -e '/[[:space:]]or[[:space:]]/s/\(GPL-[^[:space:]]*\)/(\1 WITH Linux-syscall-note)/g' \ -e '/[[:space:]]OR[[:space:]]/!{/[[:space:]]or[[:space:]]/!s/\(GPL-[^[:space:]]*\)/\1 WITH Linux-syscall-note/g}' $file done After this patch is applied, there are 5 UAPI headers that do not contain "WITH Linux-syscall-note". They are kept untouched since this exception applies only to GPL variants. $ git grep --not -e Linux-syscall-note --and -e SPDX-License-Identifier \ -- :arch/*/include/uapi/asm/*.h :include/uapi/ :^*/Kbuild include/uapi/drm/panfrost_drm.h:/* SPDX-License-Identifier: MIT */ include/uapi/linux/batman_adv.h:/* SPDX-License-Identifier: MIT */ include/uapi/linux/qemu_fw_cfg.h:/* SPDX-License-Identifier: BSD-3-Clause */ include/uapi/linux/vbox_err.h:/* SPDX-License-Identifier: MIT */ include/uapi/linux/virtio_iommu.h:/* SPDX-License-Identifier: BSD-3-Clause */ Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-25drm/tinydrm: Move mipi-dbiNoralf Trønnes1-0/+0
This moves mipi-dbi to be a core helper with the name drm_mipi_dbi. Fixup include's in drivers. Move the docs entry and delete tinydrm.rst. Delete the last tinydrm todo entry. v2: Make DRM_MIPI_DBI tristate to enable it being built as a module. Cc: Eric Anholt <eric@anholt.net> Cc: David Lechner <david@lechnology.com> Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Noralf Trønnes <noralf@tronnes.org> Acked-by: David Lechner <david@lechnology.com> Acked-by: Eric Anholt <eric@anholt.net> Link: https://patchwork.freedesktop.org/patch/msgid/20190722104312.16184-9-noralf@tronnes.org
2019-07-25drm/tinydrm: Split struct mipi_dbi in twoNoralf Trønnes1-38/+91
Split struct mipi_dbi into an interface part and a display pipeline part. The interface part can be used by drivers that need to initialize the controller, but that won't upload the framebuffer over this interface. MIPI DBI supports 3 interface types: - A. Motorola 6800 type parallel bus - B. Intel 8080 type parallel bus - C. SPI type with 3 options: I've embedded the SPI type specifics in the mipi_dbi struct to avoid adding unnecessary complexity. If more interface types will be supported in the future, the type specifics might have to be split out. Rename functions to match the new struct mipi_dbi_dev: - drm_to_mipi_dbi() -> drm_to_mipi_dbi_dev(). - mipi_dbi_init*() -> mipi_dbi_dev_init*(). Cc: Eric Anholt <eric@anholt.net> Cc: David Lechner <david@lechnology.com> Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Noralf Trønnes <noralf@tronnes.org> Acked-by: David Lechner <david@lechnology.com> Acked-by: Eric Anholt <eric@anholt.net> Link: https://patchwork.freedesktop.org/patch/msgid/20190722104312.16184-5-noralf@tronnes.org
2019-07-25drm/tinydrm: Rename remaining variable mipi -> dbidevNoralf Trønnes1-5/+5
struct mipi_dbi is going to be split into an interface part and a display pipeline part. The interface part can be used by drivers that need to initialize the controller, but that won't upload the framebuffer over this interface. tinydrm uses the variable name 'mipi' but this is not a good name since MIPI refers to a lot of standards. This patch changes the variable name to 'dbidev' where it refers to the pipeline part of struct mipi_dbi. Cc: Eric Anholt <eric@anholt.net> Cc: David Lechner <david@lechnology.com> Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Noralf Trønnes <noralf@tronnes.org> Acked-by: David Lechner <david@lechnology.com> Acked-by: Eric Anholt <eric@anholt.net> Link: https://patchwork.freedesktop.org/patch/msgid/20190722104312.16184-4-noralf@tronnes.org
2019-07-25drm/tinydrm: Rename variable mipi -> dbiNoralf Trønnes1-10/+10
struct mipi_dbi is going to be split into an interface part and a display pipeline part. The interface part can be used by drivers that need to initialize the controller, but that won't upload the framebuffer over this interface. tinydrm uses the variable name 'mipi' but this is not a good name since MIPI refers to a lot of standards. This patch changes the variable name to 'dbi' where it refers to the interface part of struct mipi_dbi. Functions that use both future parts will have both variables temporarily pointing to the same structure. Cc: Eric Anholt <eric@anholt.net> Cc: David Lechner <david@lechnology.com> Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Noralf Trønnes <noralf@tronnes.org> Acked-by: David Lechner <david@lechnology.com> Acked-by: Eric Anholt <eric@anholt.net> Link: https://patchwork.freedesktop.org/patch/msgid/20190722104312.16184-3-noralf@tronnes.org
2019-07-24access: avoid the RCU grace period for the temporary subjective credentialsLinus Torvalds1-1/+7
It turns out that 'access()' (and 'faccessat()') can cause a lot of RCU work because it installs a temporary credential that gets allocated and freed for each system call. The allocation and freeing overhead is mostly benign, but because credentials can be accessed under the RCU read lock, the freeing involves a RCU grace period. Which is not a huge deal normally, but if you have a lot of access() calls, this causes a fair amount of seconday damage: instead of having a nice alloc/free patterns that hits in hot per-CPU slab caches, you have all those delayed free's, and on big machines with hundreds of cores, the RCU overhead can end up being enormous. But it turns out that all of this is entirely unnecessary. Exactly because access() only installs the credential as the thread-local subjective credential, the temporary cred pointer doesn't actually need to be RCU free'd at all. Once we're done using it, we can just free it synchronously and avoid all the RCU overhead. So add a 'non_rcu' flag to 'struct cred', which can be set by users that know they only use it in non-RCU context (there are other potential users for this). We can make it a union with the rcu freeing list head that we need for the RCU case, so this doesn't need any extra storage. Note that this also makes 'get_current_cred()' clear the new non_rcu flag, in case we have filesystems that take a long-term reference to the cred and then expect the RCU delayed freeing afterwards. It's not entirely clear that this is required, but it makes for clear semantics: the subjective cred remains non-RCU as long as you only access it synchronously using the thread-local accessors, but you _can_ use it as a generic cred if you want to. It is possible that we should just remove the whole RCU markings for ->cred entirely. Only ->real_cred is really supposed to be accessed through RCU, and the long-term cred copies that nfs uses might want to explicitly re-enable RCU freeing if required, rather than have get_current_cred() do it implicitly. But this is a "minimal semantic changes" change for the immediate problem. Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Paul E. McKenney <paulmck@linux.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Jan Glauber <jglauber@marvell.com> Cc: Jiri Kosina <jikos@kernel.org> Cc: Jayachandran Chandrasekharan Nair <jnair@marvell.com> Cc: Greg KH <greg@kroah.com> Cc: Kees Cook <keescook@chromium.org> Cc: David Howells <dhowells@redhat.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-24Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-2/+2
Pull KVM fixes from Paolo Bonzini: "Bugfixes, a pvspinlock optimization, and documentation moving" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: X86: Boost queue head vCPU to mitigate lock waiter preemption Documentation: move Documentation/virtual to Documentation/virt KVM: nVMX: Set cached_vmcs12 and cached_shadow_vmcs12 NULL after free KVM: X86: Dynamically allocate user_fpu KVM: X86: Fix fpu state crash in kvm guest Revert "kvm: x86: Use task structs fpu field for user" KVM: nVMX: Clear pending KVM_REQ_GET_VMCS12_PAGES when leaving nested
2019-07-24Documentation: move Documentation/virtual to Documentation/virtChristoph Hellwig1-2/+2
Renaming docs seems to be en vogue at the moment, so fix on of the grossly misnamed directories. We usually never use "virtual" as a shortcut for virtualization in the kernel, but always virt, as seen in the virt/ top-level directory. Fix up the documentation to match that. Fixes: ed16648eb5b8 ("Move kvm, uml, and lguest subdirectories under a common "virtual" directory, I.E:") Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-07-24scsi: fcoe: fix a typoChristophe JAILLET1-1/+1
#define relative to FCOE CTLR start with FCOE_CTLR, except FCOE_CTRL_SOL_TOV. This is likely a typo and CTRL should be CTLR here as well. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-07-23dma-mapping: use dma_get_mask in dma_addressing_limitedEric Auger1-2/+2
We currently have cases where the dma_addressing_limited() gets called with dma_mask unset. This causes a NULL pointer dereference. Use dma_get_mask() accessor to prevent the crash. Fixes: b866455423e0 ("dma-mapping: add a dma_addressing_limited helper") Signed-off-by: Eric Auger <eric.auger@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-07-23drm/tinydrm: Move tinydrm_display_pipe_init() to mipi-dbiNoralf Trønnes2-27/+10
tinydrm_display_pipe_init() has only one user now, so move it to mipi-dbi. Changes: - Remove drm_connector_helper_funcs.detect, it's always connected. - Store the connector and mode in mipi_dbi instead of it's own struct. Otherwise remove some leftover tinydrm-helpers.h inclusions. Cc: Eric Anholt <eric@anholt.net> Cc: David Lechner <david@lechnology.com> Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Noralf Trønnes <noralf@tronnes.org> Link: https://patchwork.freedesktop.org/patch/msgid/20190719155916.62465-12-noralf@tronnes.org
2019-07-23drm/tinydrm/mipi-dbi: Add mipi_dbi_init_with_formats()Noralf Trønnes1-0/+5
The MIPI DBI standard support more pixel formats than what this helper supports. Add an init function that lets the driver use different format(s). This avoids open coding mipi_dbi_init() in st7586. st7586 sets preferred_depth but this is not necessary since it only supports one format. v2: Forgot to remove the mipi->rotation assignment in st7586, mipi_dbi_init_with_formats() handles it. Cc: David Lechner <david@lechnology.com> Acked-by: David Lechner <david@lechnology.com> Signed-off-by: Noralf Trønnes <noralf@tronnes.org> Tested-by: David Lechner <david@lechnology.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190719155916.62465-11-noralf@tronnes.org
2019-07-23drm/tinydrm: Move tinydrm_machine_little_endian()Noralf Trønnes1-15/+0
The tinydrm helper is going away so move it into the only user mipi-dbi. Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Noralf Trønnes <noralf@tronnes.org> Link: https://patchwork.freedesktop.org/patch/msgid/20190719155916.62465-9-noralf@tronnes.org
2019-07-23drm/tinydrm: Move tinydrm_spi_transfer()Noralf Trønnes2-5/+3
This is only used by mipi-dbi drivers so move it there. The reason this isn't moved to the SPI subsystem is that it will in a later patch pass a dummy rx buffer for SPI controllers that need this. Low memory boards (64MB) can run into a problem allocating such a "large" contiguous buffer on every transfer after a long up time. This leaves a very specific use case, so we'll keep the function here. mipi-dbi will first go through a refactoring though, before this will be done. Remove SPI todo entry now that we're done with the tinydrm.ko SPI code. v2: Drop moving the mipi_dbi_spi_init() declaration (Sam) Cc: David Lechner <david@lechnology.com> Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Acked-by: : David Lechner <david@lechnology.com> Signed-off-by: Noralf Trønnes <noralf@tronnes.org> Link: https://patchwork.freedesktop.org/patch/msgid/20190719155916.62465-8-noralf@tronnes.org
2019-07-23drm/tinydrm: Clean up tinydrm_spi_transfer()Noralf Trønnes1-2/+1
Prep work before moving the function to mipi-dbi. tinydrm_spi_transfer() was made to support one class of drivers in drivers/staging/fbtft that has not been converted to DRM yet, so strip away the unused functionality: - Start byte (header) is not used. - No driver relies on the automatic 16-bit byte swapping on little endian machines with SPI controllers only supporting 8 bits per word. Other changes: - No need to initialize ret - No need for the WARN since mipi-dbi only uses 8 and 16 bpw. - Use spi_message_init_with_transfers() Cc: David Lechner <david@lechnology.com> Acked-by: : David Lechner <david@lechnology.com> Signed-off-by: Noralf Trønnes <noralf@tronnes.org> Link: https://patchwork.freedesktop.org/patch/msgid/20190719155916.62465-7-noralf@tronnes.org
2019-07-23drm/tinydrm: Remove tinydrm_spi_max_transfer_size()Noralf Trønnes1-1/+0
spi-bcm2835 can handle >64kB buffers now so there is no need to check ->max_dma_len. The tinydrm_spi_max_transfer_size() max_len argument is not used by any callers, so not needed. Then we have the spi_max module parameter. It was added because staging/fbtft has support for it and there was a report that someone used it to set a small buffer size to avoid popping on a USB soundcard on a Raspberry Pi. In hindsight it shouldn't have been added, I should have waited for it to become a problem first. I don't know it anyone is actually using it, but since tinydrm_spi_transfer() is being moved to mipi-dbi, I'm taking the opportunity to remove it. I'll add it back to mipi-dbi if someone complains. With that out of the way, spi_max_transfer_size() can be used instead. The chosen 16kB buffer size for Type C Option 1 (9-bit) interface is somewhat arbitrary, but a bigger buffer will have a miniscule impact on transfer speed, so it's probably fine. Acked-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Noralf Trønnes <noralf@tronnes.org> Link: https://patchwork.freedesktop.org/patch/msgid/20190719155916.62465-6-noralf@tronnes.org
2019-07-23drm/tinydrm: Remove spi debug buffer dumpingNoralf Trønnes1-25/+0
The SPI event tracing can dump the buffer now so no need for this. Remove the debug print from tinydrm_spi_transfer() since this info can be gleaned from the trace event. Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Noralf Trønnes <noralf@tronnes.org> Link: https://patchwork.freedesktop.org/patch/msgid/20190719155916.62465-5-noralf@tronnes.org
2019-07-23drm/tinydrm: Use spi_is_bpw_supported()Noralf Trønnes1-1/+0
This means that tinydrm_spi_bpw_supported() can be removed. Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Noralf Trønnes <noralf@tronnes.org> Link: https://patchwork.freedesktop.org/patch/msgid/20190719155916.62465-4-noralf@tronnes.org
2019-07-23drm: Add SPI connector typeNoralf Trønnes1-0/+1
tinydrm drivers announce DRM_MODE_CONNECTOR_VIRTUAL for its SPI drivers. Add a SPI connector type to match the actual connector. X will list the connector as Unknown: X.Org X Server 1.19.2 Release Date: 2017-03-02 <...> [ 53523.905] (II) modeset(0): Output Unknown19-1 has no monitor section [ 53523.908] (II) modeset(0): EDID for output Unknown19-1 [ 53523.910] (II) modeset(0): Printing probed modes for output Unknown19-1 [ 53523.911] (II) modeset(0): Modeline "320x240"x0.0 0.00 320 320 320 320 240 240 240 240 (0.0 kHz eP) [ 53523.911] (II) modeset(0): Output Unknown19-1 connected [ 53523.912] (II) modeset(0): Using exact sizes for initial modes [ 53523.912] (II) modeset(0): Output Unknown19-1 using initial mode 320x240 +0+0 The weston source shows that it will be listed as UNNAMED. v2: Split patch in core and driver changes, expand commit message (Daniel) Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Acked-by: David Lechner <david@lechnology.com> Signed-off-by: Noralf Trønnes <noralf@tronnes.org> Link: https://patchwork.freedesktop.org/patch/msgid/20190719155916.62465-2-noralf@tronnes.org
2019-07-23block: blk-mq: Remove blk_mq_sched_started_request and started_requestMarcos Paulo de Souza1-1/+0
blk_mq_sched_completed_request is a function that checks if the elevator related to the request has started_request implemented, but currently, none of the available IO schedulers implement started_request, so remove both. Signed-off-by: Marcos Paulo de Souza <marcos.souza.org@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-07-23fbdev: Ditch fb_edid_add_monspecsDaniel Vetter1-3/+0
It's dead code ever since commit 34280340b1dc74c521e636f45cd728f9abf56ee2 Author: Geert Uytterhoeven <geert+renesas@glider.be> Date: Fri Dec 4 17:01:43 2015 +0100 fbdev: Remove unused SH-Mobile HDMI driver Also with this gone we can remove the cea_modes db. This entire thing is massively incomplete anyway, compared to the CEA parsing that drm_edid.c does. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Tavis Ormandy <taviso@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190721201956.941-1-daniel.vetter@ffwll.ch
2019-07-23ALSA: compress: Fix regression on compressed capture streamsCharles Keepax1-4/+1
A previous fix to the stop handling on compressed capture streams causes some knock on issues. The previous fix updated snd_compr_drain_notify to set the state back to PREPARED for capture streams. This causes some issues however as the handling for snd_compr_poll differs between the two states and some user-space applications were relying on the poll failing after the stream had been stopped. To correct this regression whilst still fixing the original problem the patch was addressing, update the capture handling to skip the PREPARED state rather than skipping the SETUP state as it has done until now. Fixes: 4f2ab5e1d13d ("ALSA: compress: Fix stop handling on compressed capture streams") Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com> Acked-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2019-07-23iommu/iova: Fix compilation error with !CONFIG_IOMMU_IOVAJoerg Roedel1-1/+1
The stub function for !CONFIG_IOMMU_IOVA needs to be 'static inline'. Fixes: effa467870c76 ('iommu/vt-d: Don't queue_iova() if there is no flush queue') Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-07-23clk: Add missing documentation of devm_clk_bulk_get_optional() argumentSylwester Nawrocki1-0/+1
Fix an incomplete devm_clk_bulk_get_optional() function documentation by adding description of the num_clks argument as in other *clk_bulk* functions. Fixes: 9bd5ef0bd874 ("clk: Add devm_clk_bulk_get_optional() function") Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com> Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2019-07-22Merge branch 'pdf_fixes_v1' of https://git.linuxtv.org/mchehab/experimental ↵Jonathan Corbet1-1/+1
into mauro Bring in a set of post-thrashup fixes from Mauro.
2019-07-22Merge v5.3-rc1 into drm-misc-nextMaxime Ripard1001-10040/+18599
Noralf needs some SPI patches in 5.3 to merge some work on tinydrm. Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
2019-07-22IB/hfi1: Unreserve a flushed OPFN requestKaike Wan1-5/+4
When an OPFN request is flushed, the request is completed without unreserving itself from the send queue. Subsequently, when a new request is post sent, the following warning will be triggered: WARNING: CPU: 4 PID: 8130 at rdmavt/qp.c:1761 rvt_post_send+0x72a/0x880 [rdmavt] Call Trace: [<ffffffffbbb61e41>] dump_stack+0x19/0x1b [<ffffffffbb497688>] __warn+0xd8/0x100 [<ffffffffbb4977cd>] warn_slowpath_null+0x1d/0x20 [<ffffffffc01c941a>] rvt_post_send+0x72a/0x880 [rdmavt] [<ffffffffbb4dcabe>] ? account_entity_dequeue+0xae/0xd0 [<ffffffffbb61d645>] ? __kmalloc+0x55/0x230 [<ffffffffc04e1a4c>] ib_uverbs_post_send+0x37c/0x5d0 [ib_uverbs] [<ffffffffc04e5e36>] ? rdma_lookup_put_uobject+0x26/0x60 [ib_uverbs] [<ffffffffc04dbce6>] ib_uverbs_write+0x286/0x460 [ib_uverbs] [<ffffffffbb6f9457>] ? security_file_permission+0x27/0xa0 [<ffffffffbb641650>] vfs_write+0xc0/0x1f0 [<ffffffffbb64246f>] SyS_write+0x7f/0xf0 [<ffffffffbbb74ddb>] system_call_fastpath+0x22/0x27 This patch fixes the problem by moving rvt_qp_wqe_unreserve() into rvt_qp_complete_swqe() to simplify the code and make it less error-prone. Fixes: ca95f802ef51 ("IB/hfi1: Unreserve a reserved request when it is completed") Link: https://lore.kernel.org/r/20190715164528.74174.31364.stgit@awfm-01.aw.intel.com Cc: <stable@vger.kernel.org> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-07-22Merge tag 'media/v5.3-2' of ↵Linus Torvalds1-1/+7
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fixes from Mauro Carvalho Chehab: "For two regressions in media core: - v4l2-subdev: fix regression in check_pad() - videodev2.h: change V4L2_PIX_FMT_BGRA444 define: fourcc was already in use" * tag 'media/v5.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: media: videodev2.h: change V4L2_PIX_FMT_BGRA444 define: fourcc was already in use media: v4l2-subdev: fix regression in check_pad()
2019-07-22iommu/virtio: Update to most recent specificationJean-Philippe Brucker1-14/+18
Following specification review a few things were changed in v8 of the virtio-iommu series [1], but have been omitted when merging the base driver. Add them now: * Remove the EXEC flag. * Add feature bit for the MMIO flag. * Change domain_bits to domain_range. * Add NOMEM status flag. [1] https://lore.kernel.org/linux-iommu/20190530170929.19366-1-jean-philippe.brucker@arm.com/ Fixes: edcd69ab9a32 ("iommu: Add virtio-iommu driver") Reported-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Acked-by: Joerg Roedel <jroedel@suse.de>
2019-07-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds11-26/+48
Pull networking fixes from David Miller: 1) Several netfilter fixes including a nfnetlink deadlock fix from Florian Westphal and fix for dropping VRF packets from Miaohe Lin. 2) Flow offload fixes from Pablo Neira Ayuso including a fix to restore proper block sharing. 3) Fix r8169 PHY init from Thomas Voegtle. 4) Fix memory leak in mac80211, from Lorenzo Bianconi. 5) Missing NULL check on object allocation in cxgb4, from Navid Emamdoost. 6) Fix scaling of RX power in sfp phy driver, from Andrew Lunn. 7) Check that there is actually an ip header to access in skb->data in VRF, from Peter Kosyh. 8) Remove spurious rcu unlock in hv_netvsc, from Haiyang Zhang. 9) One more tweak the the TCP fragmentation memory limit changes, to be less harmful to applications setting small SO_SNDBUF values. From Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (40 commits) tcp: be more careful in tcp_fragment() hv_netvsc: Fix extra rcu_read_unlock in netvsc_recv_callback() vrf: make sure skb->data contains ip header to make routing connector: remove redundant input callback from cn_dev qed: Prefer pcie_capability_read_word() igc: Prefer pcie_capability_read_word() cxgb4: Prefer pcie_capability_read_word() be2net: Synchronize be_update_queues with dev_watchdog bnx2x: Prevent load reordering in tx completion processing net: phy: sfp: hwmon: Fix scaling of RX power net: sched: verify that q!=NULL before setting q->flags chelsio: Fix a typo in a function name allocate_flower_entry: should check for null deref net: hns3: typo in the name of a constant kbuild: add net/netfilter/nf_tables_offload.h to header-test blacklist. tipc: Fix a typo mac80211: don't warn about CW params when not using them mac80211: fix possible memory leak in ieee80211_assign_beacon nl80211: fix NL80211_HE_MAX_CAPABILITY_LEN nl80211: fix VENDOR_CMD_RAW_DATA ...
2019-07-22iommu/vt-d: Don't queue_iova() if there is no flush queueDmitry Safonov1-0/+6
Intel VT-d driver was reworked to use common deferred flushing implementation. Previously there was one global per-cpu flush queue, afterwards - one per domain. Before deferring a flush, the queue should be allocated and initialized. Currently only domains with IOMMU_DOMAIN_DMA type initialize their flush queue. It's probably worth to init it for static or unmanaged domains too, but it may be arguable - I'm leaving it to iommu folks. Prevent queuing an iova flush if the domain doesn't have a queue. The defensive check seems to be worth to keep even if queue would be initialized for all kinds of domains. And is easy backportable. On 4.19.43 stable kernel it has a user-visible effect: previously for devices in si domain there were crashes, on sata devices: BUG: spinlock bad magic on CPU#6, swapper/0/1 lock: 0xffff88844f582008, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0 CPU: 6 PID: 1 Comm: swapper/0 Not tainted 4.19.43 #1 Call Trace: <IRQ> dump_stack+0x61/0x7e spin_bug+0x9d/0xa3 do_raw_spin_lock+0x22/0x8e _raw_spin_lock_irqsave+0x32/0x3a queue_iova+0x45/0x115 intel_unmap+0x107/0x113 intel_unmap_sg+0x6b/0x76 __ata_qc_complete+0x7f/0x103 ata_qc_complete+0x9b/0x26a ata_qc_complete_multiple+0xd0/0xe3 ahci_handle_port_interrupt+0x3ee/0x48a ahci_handle_port_intr+0x73/0xa9 ahci_single_level_irq_intr+0x40/0x60 __handle_irq_event_percpu+0x7f/0x19a handle_irq_event_percpu+0x32/0x72 handle_irq_event+0x38/0x56 handle_edge_irq+0x102/0x121 handle_irq+0x147/0x15c do_IRQ+0x66/0xf2 common_interrupt+0xf/0xf RIP: 0010:__do_softirq+0x8c/0x2df The same for usb devices that use ehci-pci: BUG: spinlock bad magic on CPU#0, swapper/0/1 lock: 0xffff88844f402008, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0 CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.43 #4 Call Trace: <IRQ> dump_stack+0x61/0x7e spin_bug+0x9d/0xa3 do_raw_spin_lock+0x22/0x8e _raw_spin_lock_irqsave+0x32/0x3a queue_iova+0x77/0x145 intel_unmap+0x107/0x113 intel_unmap_page+0xe/0x10 usb_hcd_unmap_urb_setup_for_dma+0x53/0x9d usb_hcd_unmap_urb_for_dma+0x17/0x100 unmap_urb_for_dma+0x22/0x24 __usb_hcd_giveback_urb+0x51/0xc3 usb_giveback_urb_bh+0x97/0xde tasklet_action_common.isra.4+0x5f/0xa1 tasklet_action+0x2d/0x30 __do_softirq+0x138/0x2df irq_exit+0x7d/0x8b smp_apic_timer_interrupt+0x10f/0x151 apic_timer_interrupt+0xf/0x20 </IRQ> RIP: 0010:_raw_spin_unlock_irqrestore+0x17/0x39 Cc: David Woodhouse <dwmw2@infradead.org> Cc: Joerg Roedel <joro@8bytes.org> Cc: Lu Baolu <baolu.lu@linux.intel.com> Cc: iommu@lists.linux-foundation.org Cc: <stable@vger.kernel.org> # 4.14+ Fixes: 13cf01744608 ("iommu/vt-d: Make use of iova deferred flushing") Signed-off-by: Dmitry Safonov <dima@arista.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-07-22futex: Cleanup generic SMP variant of arch_futex_atomic_op_inuser()Vasily Averin1-20/+1
The generic SMP variant of arch_futex_atomic_op_inuser() returns always -ENOSYS so the switch case and surrounding code are pointless. Remove it and just return -ENOSYS. Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Arnd Bergmann <arnd@arndb.de> Link: https://lkml.kernel.org/r/12bdaca8-99eb-e576-f842-5970ab1d6a92@virtuozzo.com
2019-07-22blk-mq: allow REQ_NOWAIT to return an error inlineJens Axboe1-1/+4
By default, if a caller sets REQ_NOWAIT and we need to block, we'll return -EAGAIN through the bio->bi_end_io() callback. For some use cases, this makes it hard to use. Allow a caller to ask for inline return of errors related to blocking by also setting REQ_NOWAIT_INLINE. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-07-22tcp: be more careful in tcp_fragment()Eric Dumazet1-0/+5
Some applications set tiny SO_SNDBUF values and expect TCP to just work. Recent patches to address CVE-2019-11478 broke them in case of losses, since retransmits might be prevented. We should allow these flows to make progress. This patch allows the first and last skb in retransmit queue to be split even if memory limits are hit. It also adds the some room due to the fact that tcp_sendmsg() and tcp_sendpage() might overshoot sk_wmem_queued by about one full TSO skb (64KB size). Note this allowance was already present in stable backports for kernels < 4.15 Note for < 4.15 backports : tcp_rtx_queue_tail() will probably look like : static inline struct sk_buff *tcp_rtx_queue_tail(const struct sock *sk) { struct sk_buff *skb = tcp_send_head(sk); return skb ? tcp_write_queue_prev(sk, skb) : tcp_write_queue_tail(sk); } Fixes: f070ef2ac667 ("tcp: tcp_fragment() should apply sane memory limits") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Andrew Prout <aprout@ll.mit.edu> Tested-by: Andrew Prout <aprout@ll.mit.edu> Tested-by: Jonathan Lemon <jonathan.lemon@gmail.com> Tested-by: Michal Kubecek <mkubecek@suse.cz> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Acked-by: Christoph Paasch <cpaasch@apple.com> Cc: Jonathan Looney <jtl@netflix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-21connector: remove redundant input callback from cn_devVasily Averin1-1/+0
A small cleanup: this callback is never used. Originally fixed by Stanislav Kinsburskiy <skinsbursky@virtuozzo.com> for OpenVZ7 bug OVZ-6877 cc: stanislav.kinsburskiy@gmail.com Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-21kbuild: add net/netfilter/nf_tables_offload.h to header-test blacklist.Jeremy Sowden1-0/+1
net/netfilter/nf_tables_offload.h includes net/netfilter/nf_tables.h which is itself on the blacklist. Reported-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-21drm/fb: remove unused function: drm_gem_fbdev_fb_create()Sam Ravnborg1-7/+0
After migrating several drivers to the generic fbdev emulation there are no users left of drm_gem_fbdev_fb_create. Delete the function. Noticed that there was no callers while browsing around in the drm_fb* code. The code that referenced the function was removed by: commit 13aff184ed9f ("drm/qxl: remove dead qxl fbdev emulation code") The actual use was removed by: commit 26d4707d445d ("drm/qxl: use generic fbdev emulation") v2: - Updated changelog based on feedback from Noralf Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Reviewed-by: Noralf Trønnes <noralf@tronnes.org> Cc: Gerd Hoffmann <kraxel@redhat.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20190721140610.GA20842@ravnborg.org
2019-07-21Merge tag 'ntb-5.3' of git://github.com/jonmason/ntbLinus Torvalds3-3/+214
Pull NTB updates from Jon Mason: "New feature to add support for NTB virtual MSI interrupts, the ability to test and use this feature in the NTB transport layer. Also, bug fixes for the AMD and Switchtec drivers, as well as some general patches" * tag 'ntb-5.3' of git://github.com/jonmason/ntb: (22 commits) NTB: Describe the ntb_msi_test client in the documentation. NTB: Add MSI interrupt support to ntb_transport NTB: Add ntb_msi_test support to ntb_test NTB: Introduce NTB MSI Test Client NTB: Introduce MSI library NTB: Rename ntb.c to support multiple source files in the module NTB: Introduce functions to calculate multi-port resource index NTB: Introduce helper functions to calculate logical port number PCI/switchtec: Add module parameter to request more interrupts PCI/MSI: Support allocating virtual MSI interrupts ntb_hw_switchtec: Fix setup MW with failure bug ntb_hw_switchtec: Skip unnecessary re-setup of shared memory window for crosslink case ntb_hw_switchtec: Remove redundant steps of switchtec_ntb_reinit_peer() function NTB: correct ntb_dev_ops and ntb_dev comment typos NTB: amd: Silence shift wrapping warning in amd_ntb_db_vector_mask() ntb_hw_switchtec: potential shift wrapping bug in switchtec_ntb_init_sndev() NTB: ntb_transport: Ensure qp->tx_mw_dma_addr is initaliazed NTB: ntb_hw_amd: set peer limit register NTB: ntb_perf: Clear stale values in doorbell and command SPAD register NTB: ntb_perf: Disable NTB link after clearing peer XLAT registers ...
2019-07-20nl80211: fix NL80211_HE_MAX_CAPABILITY_LENJohn Crispin1-1/+1
NL80211_HE_MAX_CAPABILITY_LEN has changed between D2.0 and D4.0. It is now MAC (6) + PHY (11) + MCS (12) + PPE (25) = 54. Signed-off-by: John Crispin <john@phrozen.org> Link: https://lore.kernel.org/r/20190627095832.19445-1-john@phrozen.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-07-20nl80211: fix VENDOR_CMD_RAW_DATAJohannes Berg1-1/+1
Since ERR_PTR() is an inline, not a macro, just open-code it here so it's usable as an initializer, fixing the build in brcmfmac. Reported-by: Arend Van Spriel <arend.vanspriel@broadcom.com> Fixes: 901bb9891855 ("nl80211: require and validate vendor command policy") Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-07-20Merge tag 'dma-mapping-5.3-1' of git://git.infradead.org/users/hch/dma-mappingLinus Torvalds2-0/+23
Pull dma-mapping fixes from Christoph Hellwig: "Fix various regressions: - force unencrypted dma-coherent buffers if encryption bit can't fit into the dma coherent mask (Tom Lendacky) - avoid limiting request size if swiotlb is not used (me) - fix swiotlb handling in dma_direct_sync_sg_for_cpu/device (Fugang Duan)" * tag 'dma-mapping-5.3-1' of git://git.infradead.org/users/hch/dma-mapping: dma-direct: correct the physical addr in dma_direct_sync_sg_for_cpu/device dma-direct: only limit the mapping size if swiotlb could be used dma-mapping: add a dma_addressing_limited helper dma-direct: Force unencrypted DMA under SME for certain DMA masks
2019-07-20Merge branch 'core-urgent-for-linus' of ↵Linus Torvalds3-0/+11
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core fixes from Thomas Gleixner: - A collection of objtool fixes which address recent fallout partially exposed by newer toolchains, clang, BPF and general code changes. - Force USER_DS for user stack traces [ Note: the "objtool fixes" are not all to objtool itself, but for kernel code that triggers objtool warnings. Things like missing function size annotations, or code that confuses the unwinder etc. - Linus] * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits) objtool: Support conditional retpolines objtool: Convert insn type to enum objtool: Fix seg fault on bad switch table entry objtool: Support repeated uses of the same C jump table objtool: Refactor jump table code objtool: Refactor sibling call detection logic objtool: Do frame pointer check before dead end check objtool: Change dead_end_function() to return boolean objtool: Warn on zero-length functions objtool: Refactor function alias logic objtool: Track original function across branches objtool: Add mcsafe_handle_tail() to the uaccess safe list bpf: Disable GCC -fgcse optimization for ___bpf_prog_run() x86/uaccess: Remove redundant CLACs in getuser/putuser error paths x86/uaccess: Don't leak AC flag into fentry from mcsafe_handle_tail() x86/uaccess: Remove ELF function annotation from copy_user_handle_tail() x86/head/64: Annotate start_cpu0() as non-callable x86/entry: Fix thunk function ELF sizes x86/kvm: Don't call kvm_spurious_fault() from .fixup x86/kvm: Replace vmx_vmenter()'s call to kvm_spurious_fault() with UD2 ...
2019-07-20Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2-0/+7
Pull more KVM updates from Paolo Bonzini: "Mostly bugfixes, but also: - s390 support for KVM selftests - LAPIC timer offloading to housekeeping CPUs - Extend an s390 optimization for overcommitted hosts to all architectures - Debugging cleanups and improvements" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (25 commits) KVM: x86: Add fixed counters to PMU filter KVM: nVMX: do not use dangling shadow VMCS after guest reset KVM: VMX: dump VMCS on failed entry KVM: x86/vPMU: refine kvm_pmu err msg when event creation failed KVM: s390: Use kvm_vcpu_wake_up in kvm_s390_vcpu_wakeup KVM: Boost vCPUs that are delivering interrupts KVM: selftests: Remove superfluous define from vmx.c KVM: SVM: Fix detection of AMD Errata 1096 KVM: LAPIC: Inject timer interrupt via posted interrupt KVM: LAPIC: Make lapic timer unpinned KVM: x86/vPMU: reset pmc->counter to 0 for pmu fixed_counters KVM: nVMX: Ignore segment base for VMX memory operand when segment not FS or GS kvm: x86: ioapic and apic debug macros cleanup kvm: x86: some tsc debug cleanup kvm: vmx: fix coccinelle warnings x86: kvm: avoid constant-conversion warning x86: kvm: avoid -Wsometimes-uninitized warning KVM: x86: expose AVX512_BF16 feature to guest KVM: selftests: enable pgste option for the linker on s390 KVM: selftests: Move kvm_create_max_vcpus test to generic code ...
2019-07-20Merge tag 'scsi-fixes' of ↵Linus Torvalds1-0/+3
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "This is the final round of mostly small fixes in our initial submit. It's mostly minor fixes and driver updates. The only change of note is adding a virt_boundary_mask to the SCSI host and host template to parametrise this for NVMe devices instead of having them do a call in slave_alloc. It's a fairly straightforward conversion except in the two NVMe handling drivers that didn't set it who now have a virtual infinity parameter added" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (24 commits) scsi: megaraid_sas: set an unlimited max_segment_size scsi: mpt3sas: set an unlimited max_segment_size for SAS 3.0 HBAs scsi: IB/srp: set virt_boundary_mask in the scsi host scsi: IB/iser: set virt_boundary_mask in the scsi host scsi: storvsc: set virt_boundary_mask in the scsi host template scsi: ufshcd: set max_segment_size in the scsi host template scsi: core: take the DMA max mapping size into account scsi: core: add a host / host template field for the virt boundary scsi: core: Fix race on creating sense cache scsi: sd_zbc: Fix compilation warning scsi: libfc: fix null pointer dereference on a null lport scsi: zfcp: fix GCC compiler warning emitted with -Wmaybe-uninitialized scsi: zfcp: fix request object use-after-free in send path causing wrong traces scsi: zfcp: fix request object use-after-free in send path causing seqno errors scsi: megaraid_sas: Update driver version to 07.710.50.00 scsi: megaraid_sas: Add module parameter for FW Async event logging scsi: megaraid_sas: Enable msix_load_balance for Invader and later controllers scsi: megaraid_sas: Fix calculation of target ID scsi: lpfc: reduce stack size with CONFIG_GCC_PLUGIN_STRUCTLEAK_VERBOSE scsi: devinfo: BLIST_TRY_VPD_PAGES for SanDisk Cruzer Blade ...
2019-07-20Merge tag 'kbuild-v5.3-2' of ↵Linus Torvalds1-8/+6
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull more Kbuild updates from Masahiro Yamada: - match the directory structure of the linux-libc-dev package to that of Debian-based distributions - fix incorrect include/config/auto.conf generation when Kconfig creates it along with the .config file - remove misleading $(AS) from documents - clean up precious tag files by distclean instead of mrproper - add a new coccinelle patch for devm_platform_ioremap_resource migration - refactor module-related scripts to read modules.order instead of $(MODVERDIR)/*.mod files to get the list of created modules - remove MODVERDIR - update list of header compile-test - add -fcf-protection=none flag to avoid conflict with the retpoline flags when CONFIG_RETPOLINE=y - misc cleanups * tag 'kbuild-v5.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (25 commits) kbuild: add -fcf-protection=none when using retpoline flags kbuild: update compile-test header list for v5.3-rc1 kbuild: split out *.mod out of {single,multi}-used-m rules kbuild: remove 'prepare1' target kbuild: remove the first line of *.mod files kbuild: create *.mod with full directory path and remove MODVERDIR kbuild: export_report: read modules.order instead of .tmp_versions/*.mod kbuild: modpost: read modules.order instead of $(MODVERDIR)/*.mod kbuild: modsign: read modules.order instead of $(MODVERDIR)/*.mod kbuild: modinst: read modules.order instead of $(MODVERDIR)/*.mod scsi: remove pointless $(MODVERDIR)/$(obj)/53c700.ver kbuild: remove duplication from modules.order in sub-directories kbuild: get rid of kernel/ prefix from in-tree modules.{order,builtin} kbuild: do not create empty modules.order in the prepare stage coccinelle: api: add devm_platform_ioremap_resource script kbuild: compile-test headers listed in header-test-m as well kbuild: remove unused hostcc-option kbuild: remove tag files by distclean instead of mrproper kbuild: add --hash-style= and --build-id unconditionally kbuild: get rid of misleading $(AS) from documents ...
2019-07-20Merge branch 'work.dcache2' of ↵Linus Torvalds1-1/+0
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull dcache and mountpoint updates from Al Viro: "Saner handling of refcounts to mountpoints. Transfer the counting reference from struct mount ->mnt_mountpoint over to struct mountpoint ->m_dentry. That allows us to get rid of the convoluted games with ordering of mount shutdowns. The cost is in teaching shrink_dcache_{parent,for_umount} to cope with mixed-filesystem shrink lists, which we'll also need for the Slab Movable Objects patchset" * 'work.dcache2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: switch the remnants of releasing the mountpoint away from fs_pin get rid of detach_mnt() make struct mountpoint bear the dentry reference to mountpoint, not struct mount Teach shrink_dcache_parent() to cope with mixed-filesystem shrink lists fs/namespace.c: shift put_mountpoint() to callers of unhash_mnt() __detach_mounts(): lookup_mountpoint() can't return ERR_PTR() anymore nfs: dget_parent() never returns NULL ceph: don't open-code the check for dead lockref
2019-07-20KVM: Boost vCPUs that are delivering interruptsWanpeng Li1-0/+1
Inspired by commit 9cac38dd5d (KVM/s390: Set preempted flag during vcpu wakeup and interrupt delivery), we want to also boost not just lock holders but also vCPUs that are delivering interrupts. Most smp_call_function_many calls are synchronous, so the IPI target vCPUs are also good yield candidates. This patch introduces vcpu->ready to boost vCPUs during wakeup and interrupt delivery time; unlike s390 we do not reuse vcpu->preempted so that voluntarily preempted vCPUs are taken into account by kvm_vcpu_on_spin, but vmx_vcpu_pi_put is not affected (VT-d PI handles voluntary preemption separately, in pi_pre_block). Testing on 80 HT 2 socket Xeon Skylake server, with 80 vCPUs VM 80GB RAM: ebizzy -M vanilla boosting improved 1VM 21443 23520 9% 2VM 2800 8000 180% 3VM 1800 3100 72% Testing on my Haswell desktop 8 HT, with 8 vCPUs VM 8GB RAM, two VMs, one running ebizzy -M, the other running 'stress --cpu 2': w/ boosting + w/o pv sched yield(vanilla) vanilla boosting improved 1570 4000 155% w/ boosting + w/ pv sched yield(vanilla) vanilla boosting improved 1844 5157 179% w/o boosting, perf top in VM: 72.33% [kernel] [k] smp_call_function_many 4.22% [kernel] [k] call_function_i 3.71% [kernel] [k] async_page_fault w/ boosting, perf top in VM: 38.43% [kernel] [k] smp_call_function_many 6.31% [kernel] [k] async_page_fault 6.13% libc-2.23.so [.] __memcpy_avx_unaligned 4.88% [kernel] [k] call_function_interrupt Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Paul Mackerras <paulus@ozlabs.org> Cc: Marc Zyngier <maz@kernel.org> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-07-20KVM: LAPIC: Inject timer interrupt via posted interruptWanpeng Li1-0/+6
Dedicated instances are currently disturbed by unnecessary jitter due to the emulated lapic timers firing on the same pCPUs where the vCPUs reside. There is no hardware virtual timer on Intel for guest like ARM, so both programming timer in guest and the emulated timer fires incur vmexits. This patch tries to avoid vmexit when the emulated timer fires, at least in dedicated instance scenario when nohz_full is enabled. In that case, the emulated timers can be offload to the nearest busy housekeeping cpus since APICv has been found for several years in server processors. The guest timer interrupt can then be injected via posted interrupts, which are delivered by the housekeeping cpu once the emulated timer fires. The host should tuned so that vCPUs are placed on isolated physical processors, and with several pCPUs surplus for busy housekeeping. If disabled mwait/hlt/pause vmexits keep the vCPUs in non-root mode, ~3% redis performance benefit can be observed on Skylake server, and the number of external interrupt vmexits drops substantially. Without patch VM-EXIT Samples Samples% Time% Min Time Max Time Avg time EXTERNAL_INTERRUPT 42916 49.43% 39.30% 0.47us 106.09us 0.71us ( +- 1.09% ) While with patch: VM-EXIT Samples Samples% Time% Min Time Max Time Avg time EXTERNAL_INTERRUPT 6871 9.29% 2.96% 0.44us 57.88us 0.72us ( +- 4.02% ) Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-07-20net: flow_offload: add flow_block structure and use itPablo Neira Ayuso3-4/+15
This object stores the flow block callbacks that are attached to this block. Update flow_block_cb_lookup() to take this new object. This patch restores the block sharing feature. Fixes: da3eeb904ff4 ("net: flow_offload: add list handling functions") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-20net: flow_offload: rename tc_setup_cb_t to flow_setup_cb_tPablo Neira Ayuso3-13/+15
Rename this type definition and adapt users. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>