summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2019-06-19selftests/bpf: fix testsAlexei Starovoitov3-20/+24
Fix tests that assumed no loops. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-19bpf: fix callees pruning callersAlexei Starovoitov1-5/+6
The commit 7640ead93924 partially resolved the issue of callees incorrectly pruning the callers. With introduction of bounded loops and jmps_processed heuristic single verifier state may contain multiple branches and calls. It's possible that new verifier state (for future pruning) will be allocated inside callee. Then callee will exit (still within the same verifier state). It will go back to the caller and there R6-R9 registers will be read and will trigger mark_reg_read. But the reg->live for all frames but the top frame is not set to LIVE_NONE. Hence mark_reg_read will fail to propagate liveness into parent and future walking will incorrectly conclude that the states are equivalent because LIVE_READ is not set. In other words the rule for parent/live should be: whenever register parentage chain is set the reg->live should be set to LIVE_NONE. is_state_visited logic already follows this rule for spilled registers. Fixes: 7640ead93924 ("bpf: verifier: make sure callees don't prune with caller differences") Fixes: f4d7e40a5b71 ("bpf: introduce function calls (verification)") Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-19bpf: introduce bounded loopsAlexei Starovoitov2-13/+181
Allow the verifier to validate the loops by simulating their execution. Exisiting programs have used '#pragma unroll' to unroll the loops by the compiler. Instead let the verifier simulate all iterations of the loop. In order to do that introduce parentage chain of bpf_verifier_state and 'branches' counter for the number of branches left to explore. See more detailed algorithm description in bpf_verifier.h This algorithm borrows the key idea from Edward Cree approach: https://patchwork.ozlabs.org/patch/877222/ Additional state pruning heuristics make such brute force loop walk practical even for large loops. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-19bpf: extend is_branch_taken to registersAlexei Starovoitov1-15/+19
This patch extends is_branch_taken() logic from JMP+K instructions to JMP+X instructions. Conditional branches are often done when src and dst registers contain known scalars. In such case the verifier can follow the branch that is going to be taken when program executes. That speeds up the verification and is essential feature to support bounded loops. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-19selftests/bpf: fix tests due to const spill/fillAlexei Starovoitov2-14/+17
fix tests that incorrectly assumed that the verifier cannot track constants through stack. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-19bpf: track spill/fill of constantsAlexei Starovoitov1-25/+65
Compilers often spill induction variables into the stack, hence it is necessary for the verifier to track scalar values of the registers through stack slots. Also few bpf programs were incorrectly rejected in the past, since the verifier was not able to track such constants while they were used to compute offsets into packet headers. Tracking constants through the stack significantly decreases the chances of state pruning, since two different constants are considered to be different by state equivalency. End result that cilium tests suffer serious degradation in the number of states processed and corresponding verification time increase. before after bpf_lb-DLB_L3.o 1838 6441 bpf_lb-DLB_L4.o 3218 5908 bpf_lb-DUNKNOWN.o 1064 1064 bpf_lxc-DDROP_ALL.o 26935 93790 bpf_lxc-DUNKNOWN.o 34439 123886 bpf_netdev.o 9721 31413 bpf_overlay.o 6184 18561 bpf_lxc_jit.o 39389 359445 After further debugging turned out that cillium progs are getting hurt by clang due to the same constant tracking issue. Newer clang generates better code by spilling less to the stack. Instead it keeps more constants in the registers which hurts state pruning since the verifier already tracks constants in the registers: old clang new clang (no spill/fill tracking introduced by this patch) bpf_lb-DLB_L3.o 1838 1923 bpf_lb-DLB_L4.o 3218 3077 bpf_lb-DUNKNOWN.o 1064 1062 bpf_lxc-DDROP_ALL.o 26935 166729 bpf_lxc-DUNKNOWN.o 34439 174607 bpf_netdev.o 9721 8407 bpf_overlay.o 6184 5420 bpf_lcx_jit.o 39389 39389 The final table is depressing: old clang old clang new clang new clang const spill/fill const spill/fill bpf_lb-DLB_L3.o 1838 6441 1923 8128 bpf_lb-DLB_L4.o 3218 5908 3077 6707 bpf_lb-DUNKNOWN.o 1064 1064 1062 1062 bpf_lxc-DDROP_ALL.o 26935 93790 166729 380712 bpf_lxc-DUNKNOWN.o 34439 123886 174607 440652 bpf_netdev.o 9721 31413 8407 31904 bpf_overlay.o 6184 18561 5420 23569 bpf_lxc_jit.o 39389 359445 39389 359445 Tracking constants in the registers hurts state pruning already. Adding tracking of constants through stack hurts pruning even more. The later patch address this general constant tracking issue with coarse/precise logic. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-19libbpf: constify getter APIsAndrii Nakryiko2-70/+72
Add const qualifiers to bpf_object/bpf_program/bpf_map arguments for getter APIs. There is no need for them to not be const pointers. Verified that make -C tools/lib/bpf make -C tools/testing/selftests/bpf make -C tools/perf all build without warnings. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-19apparmor: reset pos on failure to unpack for various functionsMike Salvatore1-8/+39
Each function that manipulates the aa_ext struct should reset it's "pos" member on failure. This ensures that, on failure, no changes are made to the state of the aa_ext struct. There are paths were elements are optional and the error path is used to indicate the optional element is not present. This means instead of just aborting on error the unpack stream can become unsynchronized on optional elements, if using one of the affected functions. Cc: stable@vger.kernel.org Fixes: 736ec752d95e ("AppArmor: policy routines for loading and unpacking policy") Signed-off-by: Mike Salvatore <mike.salvatore@canonical.com> Signed-off-by: John Johansen <john.johansen@canonical.com>
2019-06-19apparmor: enforce nullbyte at end of tag stringJann Horn1-1/+1
A packed AppArmor policy contains null-terminated tag strings that are read by unpack_nameX(). However, unpack_nameX() uses string functions on them without ensuring that they are actually null-terminated, potentially leading to out-of-bounds accesses. Make sure that the tag string is null-terminated before passing it to strcmp(). Cc: stable@vger.kernel.org Fixes: 736ec752d95e ("AppArmor: policy routines for loading and unpacking policy") Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: John Johansen <john.johansen@canonical.com>
2019-06-19apparmor: fix PROFILE_MEDIATES for untrusted inputJohn Johansen1-1/+10
While commit 11c236b89d7c2 ("apparmor: add a default null dfa") ensure every profile has a policy.dfa it does not resize the policy.start[] to have entries for every possible start value. Which means PROFILE_MEDIATES is not safe to use on untrusted input. Unforunately commit b9590ad4c4f2 ("apparmor: remove POLICY_MEDIATES_SAFE") did not take into account the start value usage. The input string in profile_query_cb() is user controlled and is not properly checked to be within the limited start[] entries, even worse it can't be as userspace policy is allowed to make us of entries types the kernel does not know about. This mean usespace can currently cause the kernel to access memory up to 240 entries beyond the start array bounds. Cc: stable@vger.kernel.org Fixes: b9590ad4c4f2 ("apparmor: remove POLICY_MEDIATES_SAFE") Signed-off-by: John Johansen <john.johansen@canonical.com>
2019-06-18RDMA/efa: Handle mmap insertions overflowGal Pressman1-5/+16
When inserting a new mmap entry to the xarray we should check for 'mmap_page' overflow as it is limited to 32 bits. Fixes: 40909f664d27 ("RDMA/efa: Add EFA verbs implementation") Signed-off-by: Gal Pressman <galpress@amazon.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2019-06-18Merge branch 'md-fixes' of https://github.com/liu-song-6/linux into for-linusJens Axboe1-14/+22
Pull MD fix from Song. * 'md-fixes' of https://github.com/liu-song-6/linux: md: fix for divide error in status_resync
2019-06-18Merge tag 'for-5.2-rc5-tag' of ↵Linus Torvalds4-16/+21
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - regression where properties stored as xattrs are not properly persisted - a small readahead fix (the fstests testcase for that fix hangs on unpatched kernel, so we'd like get it merged to ease future testing) - fix a race during block group creation and deletion * tag 'for-5.2-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: Btrfs: fix failure to persist compression property xattr deletion on fsync btrfs: start readahead also in seed devices Btrfs: fix race between block group removal and block group allocation
2019-06-18Merge tag 'armsoc-fixes' of ↵Linus Torvalds49-63/+127
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC fixes from Olof Johansson: "I've been bad at collecting fixes this release cycle, so this is a fairly large batch that's been trickling in for a while. It's the usual mix, more or less. Some of the bigger things fixed: - Voltage fix for MMC on TI DRA7 that sometimes would overvoltage cards - Regression fixes for D_CAN on am355x - i.MX6SX cpuidle fix to deal with wakeup latency (dropped uart chars) - DT fixes for some DRA7 variants that don't share the superset of blocks on the chip plus the usual mix of stuff: minor build/warning fixes, Kconfig dependencies, and some DT fixlets" * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (28 commits) soc: ixp4xx: npe: Fix an IS_ERR() vs NULL check in probe ARM: ixp4xx: include irqs.h where needed ARM: ixp4xx: mark ixp4xx_irq_setup as __init ARM: ixp4xx: don't select SERIAL_OF_PLATFORM firmware: trusted_foundations: add ARMv7 dependency MAINTAINERS: Change QCOM repo location ARM: davinci: da8xx: specify dma_coherent_mask for lcdc ARM: davinci: da850-evm: call regulator_has_full_constraints() ARM: mvebu_v7_defconfig: fix Ethernet on Clearfog ARM: dts: am335x phytec boards: Fix cd-gpios active level ARM: dts: dra72x: Disable usb4_tm target module arm64: arch_k3: Fix kconfig dependency warning ARM: dts: Drop bogus CLKSEL for timer12 on dra7 MAINTAINERS: Update Stefan Wahren email address ARM: dts: bcm: Add missing device_type = "memory" property soc: bcm: brcmstb: biuctrl: Register writes require a barrier soc: brcmstb: Fix error path for unsupported CPUs ARM: dts: dra71x: Disable usb4_tm target module ARM: dts: dra71x: Disable rtc target module ARM: dts: dra76x: Disable usb4_tm target module ...
2019-06-18tun: wake up waitqueues after IFF_UP is setFei Li1-10/+9
Currently after setting tap0 link up, the tun code wakes tx/rx waited queues up in tun_net_open() when .ndo_open() is called, however the IFF_UP flag has not been set yet. If there's already a wait queue, it would fail to transmit when checking the IFF_UP flag in tun_sendmsg(). Then the saving vhost_poll_start() will add the wq into wqh until it is waken up again. Although this works when IFF_UP flag has been set when tun_chr_poll detects; this is not true if IFF_UP flag has not been set at that time. Sadly the latter case is a fatal error, as the wq will never be waken up in future unless later manually setting link up on purpose. Fix this by moving the wakeup process into the NETDEV_UP event notifying process, this makes sure IFF_UP has been set before all waited queues been waken up. Signed-off-by: Fei Li <lifei.shirley@bytedance.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-18qed: Fix -Wmaybe-uninitialized false positiveArnd Bergmann1-0/+1
A previous attempt to shut up the uninitialized variable use warning was apparently insufficient. When CONFIG_PROFILE_ANNOTATED_BRANCHES is set, gcc-8 still warns, because the unlikely() check in DP_NOTICE() causes it to no longer track the state of all variables correctly: drivers/net/ethernet/qlogic/qed/qed_dev.c: In function 'qed_llh_set_ppfid_affinity': drivers/net/ethernet/qlogic/qed/qed_dev.c:798:47: error: 'abs_ppfid' may be used uninitialized in this function [-Werror=maybe-uninitialized] addr = NIG_REG_PPF_TO_ENGINE_SEL + abs_ppfid * 0x4; ~~~~~~~~~~^~~~~ This is not a nice workaround, but always initializing the output from qed_llh_abs_ppfid() at least shuts up the false positive reliably. Fixes: 79284adeb99e ("qed: Add llh ppfid interface and 100g support for offload protocols") Fixes: 8e2ea3ea9625 ("qed: Fix static checker warning") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-18ps3_gelic: Use [] to denote a flexible array memberGeert Uytterhoeven1-1/+1
Flexible array members should be denoted using [] instead of [0], else gcc will not warn when they are no longer at the end of a struct. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-18Merge tag 'meminit-v5.2-rc6' of ↵Linus Torvalds1-6/+15
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull stack init fix from Kees Cook: "This is a small update to the stack auto-initialization self-test code to deal with the Clang initialization pattern. It's been in linux-next for a couple weeks; I had waited a bit wondering if anything more substantial was going to show up, but nothing has, so I'm sending this now before it gets too late" * tag 'meminit-v5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: lib/test_stackinit: Handle Clang auto-initialization pattern
2019-06-18ipoib: show VF broadcast addressDenis Kirjanov2-0/+10
in IPoIB case we can't see a VF broadcast address for but can see for PF Before: 11: ib1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc pfifo_fast state UP mode DEFAULT group default qlen 256 link/infiniband 80:00:00:66:fe:80:00:00:00:00:00:00:24:8a:07:03:00:a4:3e:7c brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff vf 0 MAC 14:80:00:00:66:fe, spoof checking off, link-state disable, trust off, query_rss off ... After: 11: ib1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc pfifo_fast state UP mode DEFAULT group default qlen 256 link/infiniband 80:00:00:66:fe:80:00:00:00:00:00:00:24:8a:07:03:00:a4:3e:7c brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff vf 0 link/infiniband 80:00:00:66:fe:80:00:00:00:00:00:00:24:8a:07:03:00:a4:3e:7c brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off v1->v2: add the IFLA_VF_BROADCAST constant v2->v3: put IFLA_VF_BROADCAST at the end to avoid KABI breakage and set NLA_REJECT dev_setlink Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org> Acked-by: Doug Ledford <dledford@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-18ipoib: correcly show a VF hardware addressDenis Kirjanov1-0/+1
in the case of IPoIB with SRIOV enabled hardware ip link show command incorrecly prints 0 instead of a VF hardware address. Before: 11: ib1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc pfifo_fast state UP mode DEFAULT group default qlen 256 link/infiniband 80:00:00:66:fe:80:00:00:00:00:00:00:24:8a:07:03:00:a4:3e:7c brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff vf 0 MAC 00:00:00:00:00:00, spoof checking off, link-state disable, trust off, query_rss off ... After: 11: ib1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc pfifo_fast state UP mode DEFAULT group default qlen 256 link/infiniband 80:00:00:66:fe:80:00:00:00:00:00:00:24:8a:07:03:00:a4:3e:7c brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff vf 0 link/infiniband 80:00:00:66:fe:80:00:00:00:00:00:00:24:8a:07:03:00:a4:3e:7c brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff, spoof checking off, link-state disable, trust off, query_rss off v1->v2: just copy an address without modifing ifla_vf_mac v2->v3: update the changelog Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org> Acked-by: Doug Ledford <dledford@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-18drm: return -EFAULT if copy_to_user() failsDan Carpenter2-2/+8
The copy_from_user() function returns the number of bytes remaining to be copied but we want to return a negative error code. Otherwise the callers treat it as a successful copy. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Sean Paul <seanpaul@chromium.org> Link: https://patchwork.freedesktop.org/patch/msgid/20190618131843.GA29463@mwanda
2019-06-18net: remove duplicate fetch in sock_getsockoptJingYi Hou1-3/+0
In sock_getsockopt(), 'optlen' is fetched the first time from userspace. 'len < 0' is then checked. Then in condition 'SO_MEMINFO', 'optlen' is fetched the second time from userspace. If change it between two fetches may cause security problems or unexpected behaivor, and there is no reason to fetch it a second time. To fix this, we need to remove the second fetch. Signed-off-by: JingYi Hou <houjingyi647@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-18tipc: fix issues with early FAILOVER_MSG from peerTuong Lien2-4/+7
It appears that a FAILOVER_MSG can come from peer even when the failure link is resetting (i.e. just after the 'node_write_unlock()'...). This means the failover procedure on the node has not been started yet. The situation is as follows: node1 node2 linkb linka linka linkb | | | | | | x failure | | | RESETTING | | | | | | x failure RESET | | RESETTING FAILINGOVER | | | (FAILOVER_MSG) | | |<-------------------------------------------------| | *FAILINGOVER | | | | | (dummy FAILOVER_MSG) | | |------------------------------------------------->| | RESET | | FAILOVER_END | FAILINGOVER RESET | . . . . . . . . . . . . Once this happens, the link failover procedure will be triggered wrongly on the receiving node since the node isn't in FAILINGOVER state but then another link failover will be carried out. The consequences are: 1) A peer might get stuck in FAILINGOVER state because the 'sync_point' was set, reset and set incorrectly, the criteria to end the failover would not be met, it could keep waiting for a message that has already received. 2) The early FAILOVER_MSG(s) could be queued in the link failover deferdq but would be purged or not pulled out because the 'drop_point' was not set correctly. 3) The early FAILOVER_MSG(s) could be dropped too. 4) The dummy FAILOVER_MSG could make the peer leaving FAILINGOVER state shortly, but later on it would be restarted. The same situation can also happen when the link is in PEER_RESET state and a FAILOVER_MSG arrives. The commit resolves the issues by forcing the link down immediately, so the failover procedure will be started normally (which is the same as when receiving a FAILOVER_MSG and the link is in up state). Also, the function "tipc_node_link_failover()" is toughen to avoid such a situation from happening. Acked-by: Jon Maloy <jon.maloy@ericsson.se> Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-18bnx2x: Check if transceiver implements DDM before accessMauro S. M. Rodrigues2-1/+3
Some transceivers may comply with SFF-8472 even though they do not implement the Digital Diagnostic Monitoring (DDM) interface described in the spec. The existence of such area is specified by the 6th bit of byte 92, set to 1 if implemented. Currently, without checking this bit, bnx2x fails trying to read sfp module's EEPROM with the follow message: ethtool -m enP5p1s0f1 Cannot get Module EEPROM data: Input/output error Because it fails to read the additional 256 bytes in which it is assumed to exist the DDM data. This issue was noticed using a Mellanox Passive DAC PN 01FT738. The EEPROM data was confirmed by Mellanox as correct and similar to other Passive DACs from other manufacturers. Signed-off-by: Mauro S. M. Rodrigues <maurosr@linux.vnet.ibm.com> Acked-by: Sudarsana Reddy Kalluru <skalluru@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-18Merge branch 'mlxsw-Improve-IPv6-route-insertion-rate'David S. Miller6-68/+567
Ido Schimmel says: ==================== mlxsw: Improve IPv6 route insertion rate Unlike IPv4, an IPv6 multipath route in the kernel is composed from multiple sibling routes, each representing a single nexthop. Therefore, an addition of a multipath route with N nexthops translates to N in-kernel notifications. This is inefficient for device drivers that need to program the route to the underlying device. Each time a new nexthop is appended, a new nexthop group needs to be constructed and the old one deleted. This patchset improves the situation by sending a single notification for a multipath route addition / deletion instead of one per-nexthop. When adding thousands of multipath routes with 16 nexthops, I measured an improvement of about x10 in the insertion rate. Patches #1-#3 add a flag that indicates that in-kernel notifications need to be suppressed and extend the IPv6 FIB notification info with information about the number of sibling routes that are being notified. Patches #4-#5 adjust the two current listeners to these notifications to ignore notifications about IPv6 multipath routes. Patches #6-#7 adds add / delete notifications for IPv6 multipath routes. Patches #8-#14 do the same for mlxsw. Patch #15 finally removes the limitations added in patches #4-#5 and stops the kernel from sending a notification for each added / deleted nexthop. Patch #16 adds test cases. v2 (David Ahern): * Remove patch adjusting netdevsim to consume resources for each fib6_info. Instead, consume one resource for the entire multipath route * Remove 'multipath_rt' usage in patch #10 * Remove 'multipath_rt' from 'struct fib6_entry_notifier_info' in patch #15. The member is only removed in this patch to prevent drivers from processing multipath routes twice during the series ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-18selftests: mlxsw: Add a test for FIB offload indicationIdo Schimmel1-0/+349
Test that the offload indication for unicast routes is correctly set in different scenarios. IPv4 support will be added in the future. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-18ipv6: Stop sending in-kernel notifications for each nexthopIdo Schimmel4-22/+17
Both listeners - mlxsw and netdevsim - of IPv6 FIB notifications are now ready to handle IPv6 multipath notifications. Therefore, stop ignoring such notifications in both drivers and stop sending notification for each added / deleted nexthop. v2: * Remove 'multipath_rt' from 'struct fib6_entry_notifier_info' Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-18mlxsw: spectrum_router: Create IPv6 multipath routes in one goIdo Schimmel1-13/+23
Allow the driver to create an IPv6 multipath route in one go by passing an array of sibling routes and iterating over them. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-18mlxsw: spectrum_router: Add / delete multiple IPv6 nexthopsIdo Schimmel1-22/+39
Currently, the functions that take care of populating IPv6 nexthop groups only add / delete a single nexthop. Prepare them to handle multiple routes in one notification by passing an array of routes and adding / deleting all of them. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-18mlxsw: spectrum_router: Pass array of routes to route handling functionsIdo Schimmel1-4/+10
Prepare the driver to handle multiple routes in a single notification by passing an array of routes to the functions that actually add / delete a route. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-18mlxsw: spectrum_router: Adjust IPv6 replace logic to new notificationsIdo Schimmel1-7/+7
Previously, IPv6 replace notifications were only sent from fib6_add_rt2node(). The function only emitted such notifications if a route actually replaced another route. A previous patch added another call site in ip6_route_multipath_add() from which such notification can be emitted even if a route was merely added and did not replace another route. Adjust the driver to take this into account and potentially set the 'replace' flag to 'false' if the notified route did not replace an existing route. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-18mlxsw: spectrum_router: Pass multiple routes to work itemIdo Schimmel1-7/+65
Prepare the driver to process IPv6 multipath notifications by passing an array of 'struct fib6_info' instead of just one route. A reference is taken on each sibling route in order to prevent them from being freed until they are processed by the workqueue. v2: * Remove 'multipath_rt' usage Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-18mlxsw: spectrum_router: Prepare function to return errorsIdo Schimmel1-3/+11
The function mlxsw_sp_router_fib6_event() takes care of preparing the needed information for the work item that actually inserts the route into the device. When processing an IPv6 multipath route, the function will need to allocate an array to store pointers to all the sibling routes. Change the function's signature to return an error code and adjust the single call site. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-18mlxsw: spectrum_router: Remove processing of IPv6 append notificationsIdo Schimmel1-2/+0
No such notifications are sent by the IPv6 code, so remove them. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-18ipv6: Add IPv6 multipath notification for route deleteIdo Schimmel1-0/+6
If all the nexthops of a multipath route are being deleted, send one notification for the entire route, instead of one per-nexthop. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-18ipv6: Add IPv6 multipath notifications for add / replaceIdo Schimmel1-0/+15
Emit a notification when a multipath routes is added or replace. Note that unlike the replace notifications sent from fib6_add_rt2node(), it is possible we are sending a 'FIB_EVENT_ENTRY_REPLACE' when a route was merely added and not replaced. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-18netdevsim: Ignore IPv6 multipath notificationsIdo Schimmel1-0/+7
In a similar fashion to previous patch, have netdevsim ignore IPv6 multipath notifications for now. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-18mlxsw: spectrum_router: Ignore IPv6 multipath notificationsIdo Schimmel1-0/+2
IPv6 multipath notifications are about to be sent, but mlxsw is not ready to process them, so ignore them. The limitation will be lifted by a subsequent patch which will also stop the kernel from sending a notification for each nexthop. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-18ipv6: Extend notifier info for multipath routesIdo Schimmel2-0/+24
Extend the IPv6 FIB notifier info with number of sibling routes being notified. This will later allow listeners to process one notification for a multipath routes instead of N, where N is the number of nexthops. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-18netlink: Add field to skip in-kernel notificationsIdo Schimmel1-1/+3
The struct includes a 'skip_notify' flag that indicates if netlink notifications to user space should be suppressed. As explained in commit 3b1137fe7482 ("net: ipv6: Change notifications for multipath add to RTA_MULTIPATH"), this is useful to suppress per-nexthop RTM_NEWROUTE notifications when an IPv6 multipath route is added / deleted. Instead, one notification is sent for the entire multipath route. This concept is also useful for in-kernel notifications. Sending one in-kernel notification for the addition / deletion of an IPv6 multipath route - instead of one per-nexthop - provides a significant increase in the insertion / deletion rate to underlying devices. Add a 'skip_notify_kernel' flag to suppress in-kernel notifications. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-18netlink: Document all fields of 'struct nl_info'Ido Schimmel1-0/+2
Some fields were not documented. Add documentation. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-18Merge branch '40GbE' of ↵David S. Miller5-335/+440
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2019-06-17 This series contains updates to the iavf driver only. Akeem updates the driver to change how VLAN tags are being populated and programmed into the hardware by starting from the first member of the list until the number of allowed VLAN tags is exhausted. Mitch fixed the variable type since the variable counter starts out negative and climbs to zero, so use a signed integer instead of unsigned. Also increase the timeout to avoid erroneous errors. Fixed the driver to be able to handle when the hardware hands us a null receive descriptor with no data attached, yet is still valid. Aleksandr fixes the driver to use GFP_ATOMIC when allocating memory in atomic context. Avinash updates the driver to fix a calculation error in virtchnl regarding the valid length. Jakub does some refactoring of the commands processing the watchdog state machine to reduce the length and complexity of the function. Also decalre watchdog task as delayed work and use a dedicated work queue to service the driver tasks. Paul updated the iavf_process_aq_command to call the necessary functions to be able to clear cloud filter bits that need to be cleared. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-18xhci: detect USB 3.2 capable host controllers correctlyMathias Nyman1-5/+15
USB 3.2 capability in a host can be detected from the xHCI Supported Protocol Capability major and minor revision fields. If major is 0x3 and minor 0x20 then the host is USB 3.2 capable. For USB 3.2 capable hosts set the root hub lane count to 2. The Major Revision and Minor Revision fields contain a BCD version number. The value of the Major Revision field is JJh and the value of the Minor Revision field is MNh for version JJ.M.N, where JJ = major revision number, M - minor version number, N = sub-minor version number, e.g. version 3.1 is represented with a value of 0310h. Also fix the extra whitespace printed out when announcing regular SuperSpeed hosts. Cc: <stable@vger.kernel.org> # v4.18+ Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-18usb: xhci: Don't try to recover an endpoint if port is in error state.Mathias Nyman3-1/+28
A USB3 device needs to be reset and re-enumarated if the port it connects to goes to a error state, with link state inactive. There is no use in trying to recover failed transactions by resetting endpoints at this stage. Tests show that in rare cases, after multiple endpoint resets of a roothub port the whole host controller might stop completely. Several retries to recover from transaction error can happen as it can take a long time before the hub thread discovers the USB3 port error and inactive link. We can't reliably detect the port error from slot or endpoint context due to a limitation in xhci, see xhci specs section 4.8.3: "There are several cases where the EP State field in the Output Endpoint Context may not reflect the current state of an endpoint" and "Software should maintain an accurate value for EP State, by tracking it with an internal variable that is driven by Events and Doorbell accesses" Same appears to be true for slot state. set a flag to the corresponding slot if a USB3 roothub port link goes inactive to prevent both queueing new URBs and resetting endpoints. Reported-by: Rapolu Chiranjeevi <chiranjeevi.rapolu@intel.com> Tested-by: Rapolu Chiranjeevi <chiranjeevi.rapolu@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-18mlxsw: spectrum_ptp: Fix compilation on 32-bit ARMShalom Toledo1-3/+2
Compilation on 32-bit ARM fails after commit 992aa864dca0 ("mlxsw: spectrum_ptp: Add implementation for physical hardware clock operations") because of 64-bit division: arm-linux-gnueabi-ld: drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.o: in function `mlxsw_sp1_ptp_phc_settime': spectrum_ptp.c:(.text+0x39c): undefined reference to `__aeabi_uldivmod' Fix by using div_u64(). Fixes: 992aa864dca0 ("mlxsw: spectrum_ptp: Add implementation for physical hardware clock operations") Signed-off-by: Shalom Toledo <shalomt@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Reported-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-18KVM: fix typo in documentationDennis Restle1-1/+1
The documentation mentions a non-existing capability KVM_CAP_USER_MEM.s The right name is KVM_CAP_USER_MEMORY. Signed-off-by: Dennis Restle <derestle@htwg-konstanz.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-06-18drm/panfrost: Make sure a BO is only unmapped when appropriateBoris Brezillon3-1/+11
mmu_ops->unmap() will fail when called on a BO that has not been previously mapped, and the error path in panfrost_ioctl_create_bo() can call drm_gem_object_put_unlocked() (which in turn calls panfrost_mmu_unmap()) on a BO that has not been mapped yet. Keep track of the mapped/unmapped state to avoid such issues. Fixes: f3ba91228e8e ("drm/panfrost: Add initial panfrost driver") Cc: <stable@vger.kernel.org> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20190618081343.16927-1-boris.brezillon@collabora.com
2019-06-18md: fix for divide error in status_resyncMariusz Tkaczyk1-14/+22
Stopping external metadata arrays during resync/recovery causes retries, loop of interrupting and starting reconstruction, until it hit at good moment to stop completely. While these retries curr_mark_cnt can be small- especially on HDD drives, so subtraction result can be smaller than 0. However it is casted to uint without checking. As a result of it the status bar in /proc/mdstat while stopping is strange (it jumps between 0% and 99%). The real problem occurs here after commit 72deb455b5ec ("block: remove CONFIG_LBDAF"). Sector_div() macro has been changed, now the divisor is casted to uint32. For db = -8 the divisior(db/32-1) becomes 0. Check if db value can be really counted and replace these macro by div64_u64() inline. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com> Signed-off-by: Song Liu <songliubraving@fb.com>
2019-06-18soc: ixp4xx: npe: Fix an IS_ERR() vs NULL check in probeDan Carpenter1-2/+2
The devm_ioremap_resource() function doesn't return NULL, it returns error pointers. Fixes: 0b458d7b10f8 ("soc: ixp4xx: npe: Pass addresses as resources") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Olof Johansson <olof@lixom.net>
2019-06-18arm64/mm: don't initialize pgd_cache twiceMike Rapoport1-2/+1
When PGD_SIZE != PAGE_SIZE, arm64 uses kmem_cache for allocation of PGD memory. That cache was initialized twice: first through pgtable_cache_init() alias and then as an override for weak pgd_cache_init(). Remove the alias from pgtable_cache_init() and keep the only pgd_cache initialization in pgd_cache_init(). Fixes: caa841360134 ("x86/mm: Initialize PGD cache during mm initialization") Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>