summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2017-02-03bridge: uapi: add per vlan tunnel infoRoopa Prabhu3-0/+13
New nested netlink attribute to associate tunnel info per vlan. This is used by bridge driver to send tunnel metadata to bridge ports in vlan tunnel mode. This patch also adds new per port flag IFLA_BRPORT_VLAN_TUNNEL to enable vlan tunnel mode. off by default. One example use for this is a vxlan bridging gateway or vtep which maps vlans to vn-segments (or vnis). User can configure per-vlan tunnel information which the bridge driver can use to bridge vlan into the corresponding vn-segment. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03vxlan: support fdb and learning in COLLECT_METADATA modeRoopa Prabhu2-71/+126
Vxlan COLLECT_METADATA mode today solves the per-vni netdev scalability problem in l3 networks. It expects all forwarding information to be present in dst_metadata. This patch series enhances collect metadata mode to include the case where only vni is present in dst_metadata, and the vxlan driver can then use the rest of the forwarding information datbase to make forwarding decisions. There is no change to default COLLECT_METADATA behaviour. These changes only apply to COLLECT_METADATA when used with the bridging use-case with a special dst_metadata tunnel info flag (eg: where vxlan device is part of a bridge). For all this to work, the vxlan driver will need to now support a single fdb table hashed by mac + vni. This series essentially makes this happen. use-case and workflow: vxlan collect metadata device participates in bridging vlan to vn-segments. Bridge driver above the vxlan device, sends the vni corresponding to the vlan in the dst_metadata. vxlan driver will lookup forwarding database with (mac + vni) for the required remote destination information to forward the packet. Changes introduced by this patch: - allow learning and forwarding database state in vxlan netdev in COLLECT_METADATA mode. Current behaviour is not changed by default. tunnel info flag IP_TUNNEL_INFO_BRIDGE is used to support the new bridge friendly mode. - A single fdb table hashed by (mac, vni) to allow fdb entries with multiple vnis in the same fdb table - rx path already has the vni - tx path expects a vni in the packet with dst_metadata - prior to this series, fdb remote_dsts carried remote vni and the vxlan device carrying the fdb table represented the source vni. With the vxlan device now representing multiple vnis, this patch adds a src vni attribute to the fdb entry. The remote vni already uses NDA_VNI attribute. This patch introduces NDA_SRC_VNI netlink attribute to represent the src vni in a multi vni fdb table. iproute2 example (patched and pruned iproute2 output to just show relevant fdb entries): example shows same host mac learnt on two vni's. before (netdev per vni): $bridge fdb show | grep "00:02:00:00:00:03" 00:02:00:00:00:03 dev vxlan1001 dst 12.0.0.8 self 00:02:00:00:00:03 dev vxlan1000 dst 12.0.0.8 self after this patch with collect metadata in bridged mode (single netdev): $bridge fdb show | grep "00:02:00:00:00:03" 00:02:00:00:00:03 dev vxlan0 src_vni 1001 dst 12.0.0.8 self 00:02:00:00:00:03 dev vxlan0 src_vni 1000 dst 12.0.0.8 self Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03ip_tunnels: new IP_TUNNEL_INFO_BRIDGE flag for ip_tunnel_info modeRoopa Prabhu1-0/+1
New ip_tunnel_info flag to represent bridged tunnel metadata. Used by bridge driver later in the series to pass per vlan dst metadata to bridge ports. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03Merge branch 'ife-to-module'David S. Miller13-90/+276
Yotam Gigi says: ==================== Extract IFE logic to module Extract ife logic from the tc_ife action into an independent module, and make the tc_ife action use it. This way, the ife encapsulation can be used by other modules other than tc_ife action. v1->v2: Fix duplicate symbol error by introducing a new patch that makes the original symbol static. The symbol ife_tlv_meta_extract is exported in act_ife, though not being used by any other module. As the symbol is being moved to the new ife module, introducing the new module creates duplicate symbol. To fix it, add a new patch (1/3) that makes the ife_tlv_meta_extract symbol static in act_ife, thus the symbol does not collide. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03net/sched: act_ife: Change to use ife moduleYotam Gigi4-88/+34
Use the encode/decode functionality from the ife module instead of using implementation inside the act_ife. Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Roman Mashak <mrv@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03net: Introduce ife encapsulation moduleYotam Gigi9-0/+242
This module is responsible for the ife encapsulation protocol encode/decode logics. That module can: - ife_encode: encode skb and reserve space for the ife meta header - ife_decode: decode skb and extract the meta header size - ife_tlv_meta_encode - encodes one tlv entry into the reserved ife header space. - ife_tlv_meta_decode - decodes one tlv entry from the packet - ife_tlv_meta_next - advance to the next tlv Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Roman Mashak <mrv@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03net/sched: act_ife: Unexport ife_tlv_meta_encodeYotam Gigi2-4/+2
As the function ife_tlv_meta_encode is not used by any other module, unexport it and make it static for the act_ife module. Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Roman Mashak <mrv@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03Merge tag 'mmc-v4.10-rc6' of ↵Linus Torvalds1-1/+2
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc Pull MMC fix from Ulf Hansson: "MMC host: sdhci: Avoid hang when receiving spurious CARD_INT interrupts" * tag 'mmc-v4.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: sdhci: Ignore unexpected CARD_INT interrupts
2017-02-03acpi, nfit: fix acpi_nfit_flush_probe() crashDan Williams1-1/+5
We queue an on-stack work item to 'nfit_wq' and wait for it to complete as part of a 'flush_probe' request. However, if the user cancels the wait we need to make sure the item is flushed from the queue otherwise we are leaving an out-of-scope stack address on the work list. BUG: unable to handle kernel paging request at ffffbcb3c72f7cd0 IP: [<ffffffffa9413a7b>] __list_add+0x1b/0xb0 [..] RIP: 0010:[<ffffffffa9413a7b>] [<ffffffffa9413a7b>] __list_add+0x1b/0xb0 RSP: 0018:ffffbcb3c7ba7c00 EFLAGS: 00010046 [..] Call Trace: [<ffffffffa90bb11a>] insert_work+0x3a/0xc0 [<ffffffffa927fdda>] ? seq_open+0x5a/0xa0 [<ffffffffa90bb30a>] __queue_work+0x16a/0x460 [<ffffffffa90bbb08>] queue_work_on+0x38/0x40 [<ffffffffc0cf2685>] acpi_nfit_flush_probe+0x95/0xc0 [nfit] [<ffffffffc0cf25d0>] ? nfit_visible+0x40/0x40 [nfit] [<ffffffffa9571495>] wait_probe_show+0x25/0x60 [<ffffffffa9546b30>] dev_attr_show+0x20/0x50 Fixes: 7ae0fa439faf ("nfit, libnvdimm: async region scrub workqueue") Cc: <stable@vger.kernel.org> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-02-03Merge tag 'drm-fixes-for-v4.10-rc7' of ↵Linus Torvalds24-165/+171
git://people.freedesktop.org/~airlied/linux Pull drm fixes from Dave Airlie: "Another fixes pull for v4.10, it's a bit big due to the backport of the VMA fixes for i915 that should fix the oops on shutdown problems that you've worked around. There are also two drm core connector registration fixes, a bunch of nouveau regression fixes and two AMD fixes" * tag 'drm-fixes-for-v4.10-rc7' of git://people.freedesktop.org/~airlied/linux: drm/radeon: Fix vram_size/visible values in DRM_RADEON_GEM_INFO ioctl drm/amdgpu/si: fix crash on headless asics drm/i915: Track pinned vma in intel_plane_state drm/atomic: Unconditionally call prepare_fb. drm/atomic: Fix double free in drm_atomic_state_default_clear drm/nouveau/kms/nv50: request vblank events for commits that send completion events drm/nouveau/nv1a,nv1f/disp: fix memory clock rate retrieval drm/nouveau/disp/gt215: Fix HDA ELD handling (thus, HDMI audio) on gt215 drm/nouveau/nouveau/led: prevent compiling the led-code if nouveau=y and leds=m drm/nouveau/disp/mcp7x: disable dptmds workaround drm/nouveau: prevent userspace from deleting client object drm/nouveau/fence/g84-: protect against concurrent access to semaphore buffers drm: Don't race connector registration drm: prevent double-(un)registration for connectors
2017-02-03Merge tag 'powerpc-4.10-3' of ↵Linus Torvalds11-62/+11
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "The main change is we're reverting the initial stack protector support we merged this cycle. It turns out to not work on toolchains built with libc support, and fixing it will be need to wait for another release. And the rest are all fairly minor: - Some pasemi machines were not booting due to a missing error check in prom_find_boot_cpu() - In EEH we were checking a pointer rather than the bool it pointed to - The clang build was broken by a BUILD_BUG_ON() we added. - The radix (Power9 only) version of map_kernel_page() was broken if our memory size was a multiple of 2MB, which it generally isn't Thanks to: Darren Stevens, Gavin Shan, Reza Arbab" * tag 'powerpc-4.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/mm: Use the correct pointer when setting a 2MB pte powerpc: Fix build failure with clang due to BUILD_BUG_ON() powerpc: Revert the initial stack protector support powerpc/eeh: Fix wrong flag passed to eeh_unfreeze_pe() powerpc: Add missing error check to prom_find_boot_cpu()
2017-02-03Merge tag 'trace-v4.10-rc2-2' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fix from Steven Rostedt: "Simple fix of s/static struct __init/static __init struct/" * tag 'trace-v4.10-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing/kprobes: Fix __init annotation
2017-02-03Merge branch 'modversions' (modversions fixes for powerpc from Ard)Linus Torvalds12-62/+93
Merge kcrctab entry fixes from Ard Biesheuvel: "This is a followup to [0] 'modversions: redefine kcrctab entries as relative CRC pointers', but since relative CRC pointers do not work in modules, and are actually only needed by powerpc with CONFIG_RELOCATABLE=y, I have made it a Kconfig selectable feature instead. First it introduces the MODULE_REL_CRCS Kconfig symbol, and adds the kbuild handling of it, i.e., modpost, genksyms and kallsyms. Then it switches all architectures to 32-bit CRC entries in kcrctab, where all architectures except powerpc with CONFIG_RELOCATABLE=y use absolute ELF symbol references as before" [0] http://marc.info/?l=linux-arch&m=148493613415294&w=2 * emailed patches from Ard Biesheuvel: module: unify absolute krctab definitions for 32-bit and 64-bit modversions: treat symbol CRCs as 32 bit quantities kbuild: modversions: add infrastructure for emitting relative CRCs
2017-02-03log2: make order_base_2() behave correctly on const input value zeroArd Biesheuvel1-1/+12
The function order_base_2() is defined (according to the comment block) as returning zero on input zero, but subsequently passes the input into roundup_pow_of_two(), which is explicitly undefined for input zero. This has gone unnoticed until now, but optimization passes in GCC 7 may produce constant folded function instances where a constant value of zero is passed into order_base_2(), resulting in link errors against the deliberately undefined '____ilog2_NaN'. So update order_base_2() to adhere to its own documented interface. [ See http://marc.info/?l=linux-kernel&m=147672952517795&w=2 and follow-up discussion for more background. The gcc "optimization pass" is really just broken, but now the GCC trunk problem seems to have escaped out of just specially built daily images, so we need to work around it in mainline. - Linus ] Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-03KVM: x86: do not save guest-unsupported XSAVE stateRadim Krčmář1-0/+1
Saving unsupported state prevents migration when the new host does not support a XSAVE feature of the original host, even if the feature is not exposed to the guest. We've masked host features with guest-visible features before, with 4344ee981e21 ("KVM: x86: only copy XSAVE state for the supported features") and dropped it when implementing XSAVES. Do it again. Fixes: df1daba7d1cb ("KVM: x86: support XSAVES usage in the host") Cc: stable@vger.kernel.org Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-02-03module: unify absolute krctab definitions for 32-bit and 64-bitArd Biesheuvel1-7/+0
The previous patch introduced a separate inline asm version of the krcrctab declaration template for use with 64-bit architectures, which cannot refer to ELF symbols using 32-bit quantities. This declaration should be equivalent to the C one for 32-bit architectures, but just in case - unify them in a separate patch, which can simply be dropped if it turns out to break anything. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-03modversions: treat symbol CRCs as 32 bit quantitiesArd Biesheuvel7-52/+53
The modversion symbol CRCs are emitted as ELF symbols, which allows us to easily populate the kcrctab sections by relying on the linker to associate each kcrctab slot with the correct value. This has a couple of downsides: - Given that the CRCs are treated as memory addresses, we waste 4 bytes for each CRC on 64 bit architectures, - On architectures that support runtime relocation, a R_<arch>_RELATIVE relocation entry is emitted for each CRC value, which identifies it as a quantity that requires fixing up based on the actual runtime load offset of the kernel. This results in corrupted CRCs unless we explicitly undo the fixup (and this is currently being handled in the core module code) - Such runtime relocation entries take up 24 bytes of __init space each, resulting in a x8 overhead in [uncompressed] kernel size for CRCs. Switching to explicit 32 bit values on 64 bit architectures fixes most of these issues, given that 32 bit values are not treated as quantities that require fixing up based on the actual runtime load offset. Note that on some ELF64 architectures [such as PPC64], these 32-bit values are still emitted as [absolute] runtime relocatable quantities, even if the value resolves to a build time constant. Since relative relocations are always resolved at build time, this patch enables MODULE_REL_CRCS on powerpc when CONFIG_RELOCATABLE=y, which turns the absolute CRC references into relative references into .rodata where the actual CRC value is stored. So redefine all CRC fields and variables as u32, and redefine the __CRC_SYMBOL() macro for 64 bit builds to emit the CRC reference using inline assembler (which is necessary since 64-bit C code cannot use 32-bit types to hold memory addresses, even if they are ultimately resolved using values that do not exceed 0xffffffff). To avoid potential problems with legacy 32-bit architectures using legacy toolchains, the equivalent C definition of the kcrctab entry is retained for 32-bit architectures. Note that this mostly reverts commit d4703aefdbc8 ("module: handle ppc64 relocating kcrctabs when CONFIG_RELOCATABLE=y") Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-03kbuild: modversions: add infrastructure for emitting relative CRCsArd Biesheuvel5-5/+42
This add the kbuild infrastructure that will allow architectures to emit vmlinux symbol CRCs as 32-bit offsets to another location in the kernel where the actual value is stored. This works around problems with CRCs being mistaken for relocatable symbols on kernels that self relocate at runtime (i.e., powerpc with CONFIG_RELOCATABLE=y) For the kbuild side of things, this comes down to the following: - introducing a Kconfig symbol MODULE_REL_CRCS - adding a -R switch to genksyms to instruct it to emit the CRC symbols as references into the .rodata section - making modpost distinguish such references from absolute CRC symbols by the section index (SHN_ABS) - making kallsyms disregard non-absolute symbols with a __crc_ prefix Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-03tcp: add tcp_mss_clamp() helperEric Dumazet5-23/+17
Small cleanup factorizing code doing the TCP_MAXSEG clamping. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03hns_enet: use cpumask_var_t for on-stack maskArnd Bergmann1-10/+15
On large SMP builds, we can run into a build warning: drivers/net/ethernet/hisilicon/hns/hns_enet.c: In function 'hns_set_irq_affinity.isra.27': drivers/net/ethernet/hisilicon/hns/hns_enet.c:1242:1: warning: the frame size of 1032 bytes is larger than 1024 bytes [-Wframe-larger-than=] The solution here is to use cpumask_var_t, which can use dynamic allocation when CONFIG_CPUMASK_OFFSTACK is enabled. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03virtio_net: remove custom busy_pollEric Dumazet1-41/+0
Generic NAPI busy polling allows us to remove custom implementations found in drivers. It is possible further optimization could be done by testing napi_complete_done() return value. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03atl1e: add GRO supportEric Dumazet1-1/+1
It is time to add GRO support to this driver. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03ethtool: do not vzalloc(0) on registers dumpStanislaw Gruszka1-3/+6
If ->get_regs_len() callback return 0, we allocate 0 bytes of memory, what print ugly warning in dmesg, which can be found further below. This happen on mac80211 devices where ieee80211_get_regs_len() just return 0 and driver only fills ethtool_regs structure and actually do not provide any dump. However I assume this can happen on other drivers i.e. when for some devices driver provide regs dump and for others do not. Hence preventing to to print warning in ethtool code seems to be reasonable. ethtool: vmalloc: allocation failure: 0 bytes, mode:0x24080c2(GFP_KERNEL|__GFP_HIGHMEM|__GFP_ZERO) <snip> Call Trace: [<ffffffff813bde47>] dump_stack+0x63/0x8c [<ffffffff811b0a1f>] warn_alloc+0x13f/0x170 [<ffffffff811f0476>] __vmalloc_node_range+0x1e6/0x2c0 [<ffffffff811f0874>] vzalloc+0x54/0x60 [<ffffffff8169986c>] dev_ethtool+0xb4c/0x1b30 [<ffffffff816adbb1>] dev_ioctl+0x181/0x520 [<ffffffff816714d2>] sock_do_ioctl+0x42/0x50 <snip> Mem-Info: active_anon:435809 inactive_anon:173951 isolated_anon:0 active_file:835822 inactive_file:196932 isolated_file:0 unevictable:0 dirty:8 writeback:0 unstable:0 slab_reclaimable:157732 slab_unreclaimable:10022 mapped:83042 shmem:306356 pagetables:9507 bounce:0 free:130041 free_pcp:1080 free_cma:0 Node 0 active_anon:1743236kB inactive_anon:695804kB active_file:3343288kB inactive_file:787728kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:332168kB dirty:32kB writeback:0kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 1225424kB writeback_tmp:0kB unstable:0kB pages_scanned:0 all_unreclaimable? no Node 0 DMA free:15900kB min:136kB low:168kB high:200kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15984kB managed:15900kB mlocked:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB lowmem_reserve[]: 0 3187 7643 7643 Node 0 DMA32 free:419732kB min:28124kB low:35152kB high:42180kB active_anon:541180kB inactive_anon:248988kB active_file:1466388kB inactive_file:389632kB unevictable:0kB writepending:0kB present:3370280kB managed:3290932kB mlocked:0kB slab_reclaimable:217184kB slab_unreclaimable:4180kB kernel_stack:160kB pagetables:984kB bounce:0kB free_pcp:2236kB local_pcp:660kB free_cma:0kB lowmem_reserve[]: 0 0 4456 4456 Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03ipv6: sr: remove cleanup flag and fix HMAC computationDavid Lebrun3-38/+10
In the latest version of the IPv6 Segment Routing IETF draft [1] the cleanup flag is removed and the flags field length is shrunk from 16 bits to 8 bits. As a consequence, the input of the HMAC computation is modified in a non-backward compatible way by covering the whole octet of flags instead of only the cleanup bit. As such, if an implementation compatible with the latest draft computes the HMAC of an SRH who has other flags set to 1, then the HMAC result would differ from the current implementation. This patch carries those modifications to prevent conflict with other implementations of IPv6 SR. [1] https://tools.ietf.org/html/draft-ietf-6man-segment-routing-header-05 Signed-off-by: David Lebrun <david.lebrun@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03cxgb4: Fix uld_send() for ctrl pktsArjun V1-3/+8
Without any uld being loaded, uld_txq_info[] will be NULL. uld_send() is also used for sending control work requests(for eg: setting filter) that dont require any ulds to be loaded. Hence move uld_txq_info[] assignment after ctrl_xmit(). Also added a NULL check for uld_txq_info[]. Fixes: 94cdb8bb993a (cxgb4: Add support for dynamic allocation of resources for ULD). Signed-off-by: Arjun V <arjun@chelsio.com> Signed-off-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03dm crypt: replace RCU read-side section with rwsemOndrej Kozina1-4/+4
The lockdep splat below hints at a bug in RCU usage in dm-crypt that was introduced with commit c538f6ec9f56 ("dm crypt: add ability to use keys from the kernel key retention service"). The kernel keyring function user_key_payload() is in fact a wrapper for rcu_dereference_protected() which must not be called with only rcu_read_lock() section mark. Unfortunately the kernel keyring subsystem doesn't currently provide an interface that allows the use of an RCU read-side section. So for now we must drop RCU in favour of rwsem until a proper function is made available in the kernel keyring subsystem. =============================== [ INFO: suspicious RCU usage. ] 4.10.0-rc5 #2 Not tainted ------------------------------- ./include/keys/user-type.h:53 suspicious rcu_dereference_protected() usage! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 2 locks held by cryptsetup/6464: #0: (&md->type_lock){+.+.+.}, at: [<ffffffffa02472a2>] dm_lock_md_type+0x12/0x20 [dm_mod] #1: (rcu_read_lock){......}, at: [<ffffffffa02822f8>] crypt_set_key+0x1d8/0x4b0 [dm_crypt] stack backtrace: CPU: 1 PID: 6464 Comm: cryptsetup Not tainted 4.10.0-rc5 #2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.1-1.fc24 04/01/2014 Call Trace: dump_stack+0x67/0x92 lockdep_rcu_suspicious+0xc5/0x100 crypt_set_key+0x351/0x4b0 [dm_crypt] ? crypt_set_key+0x1d8/0x4b0 [dm_crypt] crypt_ctr+0x341/0xa53 [dm_crypt] dm_table_add_target+0x147/0x330 [dm_mod] table_load+0x111/0x350 [dm_mod] ? retrieve_status+0x1c0/0x1c0 [dm_mod] ctl_ioctl+0x1f5/0x510 [dm_mod] dm_ctl_ioctl+0xe/0x20 [dm_mod] do_vfs_ioctl+0x8e/0x690 ? ____fput+0x9/0x10 ? task_work_run+0x7e/0xa0 ? trace_hardirqs_on_caller+0x122/0x1b0 SyS_ioctl+0x3c/0x70 entry_SYSCALL_64_fastpath+0x18/0xad RIP: 0033:0x7f392c9a4ec7 RSP: 002b:00007ffef6383378 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00007ffef63830a0 RCX: 00007f392c9a4ec7 RDX: 000000000124fcc0 RSI: 00000000c138fd09 RDI: 0000000000000005 RBP: 00007ffef6383090 R08: 00000000ffffffff R09: 00000000012482b0 R10: 2a28205d34383336 R11: 0000000000000246 R12: 00007f392d803a08 R13: 00007ffef63831e0 R14: 0000000000000000 R15: 00007f392d803a0b Fixes: c538f6ec9f56 ("dm crypt: add ability to use keys from the kernel key retention service") Reported-by: Milan Broz <mbroz@redhat.com> Signed-off-by: Ondrej Kozina <okozina@redhat.com> Reviewed-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-02-03dm rq: cope with DM device destruction while in dm_old_request_fn()Mike Snitzer1-0/+4
Fixes a crash in dm_table_find_target() due to a NULL struct dm_table being passed from dm_old_request_fn() that races with DM device destruction. Reported-by: artem@flashgrid.io Signed-off-by: Mike Snitzer <snitzer@redhat.com> Cc: stable@vger.kernel.org
2017-02-03dm mpath: cleanup -Wbool-operation warning in choose_pgpath()Mike Snitzer1-2/+2
Reported-by: David Binderman <dcb314@hotmail.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-02-03sfc-falcon: get rid of custom busy polling codeEric Dumazet3-169/+1
In linux-4.5, busy polling was implemented in core NAPI stack, meaning that all custom implementation can be removed from drivers. Not only we remove lot's of tricky code, we also remove one lock operation in fast path. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Edward Cree <ecree@solarflare.com> Cc: Bert Kenward <bkenward@solarflare.com> Acked-by: Bert Kenward <bkenward@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03sfc: get rid of custom busy polling codeEric Dumazet3-169/+1
In linux-4.5, busy polling was implemented in core NAPI stack, meaning that all custom implementation can be removed from drivers. Not only we remove lot's of tricky code, we also remove one lock operation in fast path. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Edward Cree <ecree@solarflare.com> Cc: Bert Kenward <bkenward@solarflare.com> Acked-by: Bert Kenward <bkenward@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03Merge remote-tracking branches 'regulator/fix/fixed' and ↵Mark Brown2-47/+1
'regulator/fix/twl6040' into regulator-linus
2017-02-03crypto: chcr - Fix key length for RFC4106Harsh Jain1-2/+2
Check keylen before copying salt to avoid wrap around of Integer. Signed-off-by: Harsh Jain <harsh@chelsio.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-02-03crypto: algif_aead - Fix kernel panic on list_delHarsh Jain1-1/+1
Kernel panics when userspace program try to access AEAD interface. Remove node from Linked List before freeing its memory. Cc: <stable@vger.kernel.org> Signed-off-by: Harsh Jain <harsh@chelsio.com> Reviewed-by: Stephan Müller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-02-03crypto: aesni - Fix failure when pcbc module is absentHerbert Xu1-4/+4
When aesni is built as a module together with pcbc, the pcbc module must be present for aesni to load. However, the pcbc module may not be present for reasons such as its absence on initramfs. This patch allows the aesni to function even if the pcbc module is enabled but not present. Reported-by: Arkadiusz Miśkiewicz <arekm@maven.pl> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-02-03crypto: ccp - Fix double add when creating new DMA commandGary R Hook2-1/+6
Eliminate a double-add by creating a new list to manage command descriptors when created; move the descriptor to the pending list when the command is submitted. Cc: <stable@vger.kernel.org> Signed-off-by: Gary R Hook <gary.hook@amd.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-02-03crypto: ccp - Fix DMA operations when IOMMU is enabledGary R Hook1-1/+1
An I/O page fault occurs when the IOMMU is enabled on a system that supports the v5 CCP. DMA operations use a Request ID value that does not match what is expected by the IOMMU, resulting in the I/O page fault. Setting the Request ID value to 0 corrects this issue. Cc: <stable@vger.kernel.org> Signed-off-by: Gary R Hook <gary.hook@amd.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-02-03crypto: chcr - Check device is allocated before useHarsh Jain1-10/+8
Ensure dev is allocated for crypto uld context before using the device for crypto operations. Cc: <stable@vger.kernel.org> Signed-off-by: Atul Gupta <atul.gupta@chelsio.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-02-03crypto: chcr - Fix panic on dma_unmap_sgHarsh Jain2-23/+29
Save DMA mapped sg list addresses to request context buffer. Signed-off-by: Atul Gupta <atul.gupta@chelsio.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-02-03i40e: add interrupt rate limit verbosityAlan Brady1-3/+10
Due to the resolution of the register controlling interrupt rate limiting, setting certain values for the interrupt rate limit make it appear as though the limiting is not completely accurate. The problem is that the interrupt rate limit is getting rounded down to the nearest multiple of 4. This patch fixes the problem by adding some feedback to the user as to the actual interrupt rate limit being used when it differs from the requested limit. Without this patch setting interrupt rate limits may appear to behave inaccurately. Change-ID: I3093cf3f2d437d35a4c4f4bb5af5ce1b85ab21b7 Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-02-03i40e: refactor macro INTRL_USEC_TO_REGAlan Brady3-3/+16
This patch refactors the macro INTRL_USEC_TO_REG into a static inline function and fixes a couple subtle bugs caused by the macro. This patch fixes a bug which was caused by passing a bad register value to the firmware. If enabling interrupt rate limiting, a non-zero value for the rate limit must be used. Otherwise the firmware sets the interrupt rate limit to the maximum value. Due to the limited resolution of the register, attempting to set a value of 1, 2, or 3 would be rounded down to 0 and limiting was left enabled, causing unexpected behavior. This patch also fixes a possible bug in which using the macro itself can introduce unintended side-affects because the macro argument is used more than once in the macro definition (e.g. a variable post-increment argument would perform a double increment on the variable). Without this patch, attempting to set interrupt rate limits of 1, 2, or 3 results in unexpected behavior and future use of this macro could cause subtle bugs. Change-Id: I83ac842de0ca9c86761923d6e3a4d7b1b95f2b3f Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-02-03i40e: remove unused functionMitch Williams2-36/+0
After refactoring the client open and close code, this is no longer needed. Remove it. Change-ID: If8e6e32baa354d857c2fd8b2f19404f1786011c4 Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-02-03i40e: Remove FPK HyperV VF device IDJayaprakash Shanmugam2-2/+0
Requirement for VFs to use the VMBus has been removed that's why removing Hyper-V VF device ID. Change-ID: I84f0964f443ee0db3e5e444b5ace996eb71b8280 Signed-off-by: Jayaprakash Shanmugam <jayaprakash.shanmugam@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-02-03i40e: Quick refactor to start moving data off stack and into Tx buffer infoAlexander Duyck2-40/+54
This patch does some quick work to pull some of the data off of the stack and hopefully start storing it in the Tx buffer info section of the Tx ring. Ideally we should be moving away from having to store much of anything on the stack and can just maintain it all in the descriptor rings. Change-ID: I4b4715ea1920e122502482b3f9e56a9a6cb1e9fe Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-02-03i40e: remove unnecessary __packedTushar Dave1-2/+2
'struct i40e_dma_mem' defined with 'packed' directive causing kernel unaligned errors on sparc. e.g. i40e: Intel(R) Ethernet Connection XL710 Network Driver - version 1.6.16-k i40e: Copyright (c) 2013 - 2014 Intel Corporation. Kernel unaligned access at TPC[44894c] dma_4v_alloc_coherent+0x1ac/0x300 Kernel unaligned access at TPC[44894c] dma_4v_alloc_coherent+0x1ac/0x300 Kernel unaligned access at TPC[44894c] dma_4v_alloc_coherent+0x1ac/0x300 Kernel unaligned access at TPC[44894c] dma_4v_alloc_coherent+0x1ac/0x300 Kernel unaligned access at TPC[44894c] dma_4v_alloc_coherent+0x1ac/0x300 i40e 0000:03:00.0: fw 5.1.40981 api 1.5 nvm 5.04 0x80002548 0.0.0 This can be fixed with get_unaligned/put_unaligned(). However no reference in driver shows that 'struct i40e_dma_mem' directly shoved into NIC hardware. But instead fields of the struct are being read and used for hardware. Therefore, __packed is unnecessary for 'struct i40e_dma_mem'. In addition, although 'struct i40e_virt_mem' doesn't cause any unaligned access, keeping it packed is unnecessary as well because of aforementioned reason. This change make 'struct i40e_dma_mem' and 'struct i40e_virt_mem' unpacked. Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-02-03i40evf: remove unused device IDMitch Williams1-1/+0
This device ID was intended for use when running Linux VF drivers under Hyper-V, but we have determined that it is not necessary. Since it is unused, and will never be used, remove it. Change-ID: I74998ab4237db043cd400547bb54a0a5e2a37ea5 Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-02-03i40e: Deprecating unused macroBimmy Pujari4-18/+3
I40E_MAC_X710 was supposed to be for 10G and I40E_MAC_XL710 was supposed to be for 40G. But function i40e_is_mac_710 sets I40E_MAC_XL710 for all device IDS, I40E_MAC_X710 is not used at all. As there is nothing to compare there is no need for this function. Thus deprecating this extra macro and removing this function entirely and replacing it with a direct check. Change-ID: I7d1769954dccd574a290ac04adb836ebd156730e Signed-off-by: Bimmy Pujari <bimmy.pujari@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-02-03i40e: when adding or removing MAC filters, correctly handle VLANsJacob Keller2-10/+7
Instead of using i40e_add_filter or i40e_del_filter directly, when adding a MAC address, we should normally be using i40e_add_mac_filter or i40e_del_mac_filter. These functions correctly handle the various cases of VLAN mode or PVID settings. This ensures consistency and avoids the issues that can occur with the recent addition of a WARN_ON() in i40e_sync_vsi_filters. Change-ID: I7fe62db063391fdd1180b2d6a6a3c5ab4307eeee Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-02-03i40e: avoid O(n^2) loop when deleting all filtersJacob Keller3-2/+3
Use __i40e_del_filter instead of using i40e_del_filter() which will avoid doing an additional search to delete a filter we already have the pointer for. Change-ID: Iea5a7e3cafbf8c682ed9d3b6c69cf5ff53f44daf Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-02-03i40e: rename i40e_put_mac_in_vlan and i40e_del_mac_all_vlanJacob Keller3-16/+18
These functions purpose is to add a new MAC filter correctly, whether we're using VLANs or not. Their goal is to ensure that all active VLANs get the new MAC filter. Rename them so that their intent is clear. They function correctly regardless of whether we have any active VLANs or only have I40E_VLAN_ANY filters. The new names convey how they function in a more clear manner. Change-ID: Iec1961f968c0223a7132724a74e26a665750b107 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-02-03i40e: no need to check is_vsi_in_vlan before calling i40e_del_mac_all_vlanJacob Keller1-4/+1
This function won't be appreciably slower when in VLAN mode, so there is no real reason to not just call it directly. In either case, we still must search the full table for a MAC/VLAN pair. We do get to stop searching a tiny bit early in the case of knowing we are not in VLAN mode, but this is a minor savings and we can avoid the code complexity by not having to worry about the check. Change-ID: I533412195b3a42f51cf629e3675dd5145aea8625 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>