summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)AuthorFilesLines
2017-11-03net: core: introduce mini_Qdisc and eliminate usage of tp->q for clsact fastpathJiri Pirko1-3/+6
In sch_handle_egress and sch_handle_ingress tp->q is used only in order to update stats. So stats and filter list are the only things that are needed in clsact qdisc fastpath processing. Introduce new mini_Qdisc struct to hold those items. Also, introduce a helper to swap the mini_Qdisc structures in case filter list head changes. This removes need for tp->q usage without added overhead. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-03net: Define eth_stp_addr in linux/etherdevice.hEgil Hjelmeland2-2/+1
The lan9303 driver defines eth_stp_addr as a synonym to eth_reserved_addr_base to get the STP ethernet address 01:80:c2:00:00:00. eth_reserved_addr_base is also used to define the start of Bridge Reserved ethernet address range, which happen to be the STP address. br_dev_setup refer to eth_reserved_addr_base as a definition of STP address. Clean up by: - Move the eth_stp_addr definition to linux/etherdevice.h - Use eth_stp_addr instead of eth_reserved_addr_base in br_dev_setup. Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-02security: bpf: replace include of linux/bpf.h with forward declarationsJakub Kicinski1-1/+4
Touching linux/bpf.h makes us rebuild a surprisingly large portion of the kernel. Remove the unnecessary dependency from security.h, it only needs forward declarations. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-3/+2
Smooth Cong Wang's bug fix into 'net-next'. Basically put the bulk of the tcf_block_put() logic from 'net' into tcf_block_put_ext(), but after the offload unbind. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-01net: dsa: lan9303: Add STP ALR entry on port 0Egil Hjelmeland1-0/+2
STP BPDUs arriving on user ports must sent to CPU port only, for processing by the SW bridge. Add an ALR entry with STP state override to fix that. Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-01bpf: reduce verifier memory consumptionAlexei Starovoitov1-3/+13
the verifier got progressively smarter over time and size of its internal state grew as well. Time to reduce the memory consumption. Before: sizeof(struct bpf_verifier_state) = 6520 After: sizeof(struct bpf_verifier_state) = 896 It's done by observing that majority of BPF programs use little to no stack whereas verifier kept all of 512 stack slots ready always. Instead dynamically reallocate struct verifier state when stack access is detected. Runtime difference before vs after is within a noise. The number of processed instructions stays the same. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-31Revert "PM / QoS: Fix device resume latency PM QoS"Rafael J. Wysocki1-3/+2
This reverts commit 0cc2b4e5a020 (PM / QoS: Fix device resume latency PM QoS) as it introduced regressions on multiple systems and the fix-up in commit 2a9a86d5c813 (PM / QoS: Fix default runtime_pm device resume latency) does not address all of them. The original problem that commit 0cc2b4e5a020 was attempting to fix will be addressed later. Fixes: 0cc2b4e5a020 (PM / QoS: Fix device resume latency PM QoS) Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-10-31Revert "PM / QoS: Fix default runtime_pm device resume latency"Rafael J. Wysocki1-2/+1
This reverts commit 2a9a86d5c813 (PM / QoS: Fix default runtime_pm device resume latency) as the commit it depends on is going to be reverted. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-10-30Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller7-33/+43
Several conflicts here. NFP driver bug fix adding nfp_netdev_is_nfp_repr() check to nfp_fl_output() needed some adjustments because the code block is in an else block now. Parallel additions to net/pkt_cls.h and net/sch_generic.h A bug fix in __tcp_retransmit_skb() conflicted with some of the rbtree changes in net-next. The tc action RCU callback fixes in 'net' had some overlap with some of the recent tcf_block reworking. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-30PM / QoS: Fix default runtime_pm device resume latencyTero Kristo1-1/+2
The recent change to the PM QoS framework to introduce a proper no constraint value overlooked to handle the devices which don't implement PM QoS OPS. Runtime PM is one of the more severely impacted subsystems, failing every attempt to runtime suspend a device. This leads into some nasty second level issues like probe failures and increased power consumption among other things. Fix this by adding a proper return value for devices that don't implement PM QoS. Fixes: 0cc2b4e5a020 (PM / QoS: Fix device resume latency PM QoS) Signed-off-by: Tero Kristo <t-kristo@ti.com> Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-10-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds3-19/+21
Pull networking fixes from David Miller: 1) Fix route leak in xfrm_bundle_create(). 2) In mac80211, validate user rate mask before configuring it. From Johannes Berg. 3) Properly enforce memory limits in fair queueing code, from Toke Hoiland-Jorgensen. 4) Fix lockdep splat in inet_csk_route_req(), from Eric Dumazet. 5) Fix TSO header allocation and management in mvpp2 driver, from Yan Markman. 6) Don't take socket lock in BH handler in strparser code, from Tom Herbert. 7) Don't show sockets from other namespaces in AF_UNIX code, from Andrei Vagin. 8) Fix double free in error path of tap_open(), from Girish Moodalbail. 9) Fix TX map failure path in igb and ixgbe, from Jean-Philippe Brucker and Alexander Duyck. 10) Fix DCB mode programming in stmmac driver, from Jose Abreu. 11) Fix err_count handling in various tunnels (ipip, ip6_gre). From Xin Long. 12) Properly align SKB head before building SKB in tuntap, from Jason Wang. 13) Avoid matching qdiscs with a zero handle during lookups, from Cong Wang. 14) Fix various endianness bugs in sctp, from Xin Long. 15) Fix tc filter callback races and add selftests which trigger the problem, from Cong Wang. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (73 commits) selftests: Introduce a new test case to tc testsuite selftests: Introduce a new script to generate tc batch file net_sched: fix call_rcu() race on act_sample module removal net_sched: add rtnl assertion to tcf_exts_destroy() net_sched: use tcf_queue_work() in tcindex filter net_sched: use tcf_queue_work() in rsvp filter net_sched: use tcf_queue_work() in route filter net_sched: use tcf_queue_work() in u32 filter net_sched: use tcf_queue_work() in matchall filter net_sched: use tcf_queue_work() in fw filter net_sched: use tcf_queue_work() in flower filter net_sched: use tcf_queue_work() in flow filter net_sched: use tcf_queue_work() in cgroup filter net_sched: use tcf_queue_work() in bpf filter net_sched: use tcf_queue_work() in basic filter net_sched: introduce a workqueue for RCU callbacks of tc filter sctp: fix some type cast warnings introduced since very beginning sctp: fix a type cast warnings that causes a_rwnd gets the wrong value sctp: fix some type cast warnings introduced by transport rhashtable sctp: fix some type cast warnings introduced by stream reconf ...
2017-10-29sctp: fix some type cast warnings introduced since very beginningXin Long1-1/+1
These warnings were found by running 'make C=2 M=net/sctp/'. They are there since very beginning. Note after this patch, there still one warning left in sctp_outq_flush(): sctp_chunk_fail(chunk, SCTP_ERROR_INV_STRM) Since it has been moved to sctp_stream_outq_migrate on net-next, to avoid the extra job when merging net-next to net, I will post the fix for it after the merging is done. Reported-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-29sctp: fix some type cast warnings introduced by stream reconfXin Long1-16/+16
These warnings were found by running 'make C=2 M=net/sctp/'. They are introduced by not aware of Endian when coding stream reconf patches. Since commit c0d8bab6ae51 ("sctp: add get and set sockopt for reconf_enable") enabled stream reconf feature for users, the Fixes tag below would use it. Fixes: c0d8bab6ae51 ("sctp: add get and set sockopt for reconf_enable") Reported-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28tap: reference to KVA of an unloaded module causes kernel panicGirish Moodalbail1-2/+2
The commit 9a393b5d5988 ("tap: tap as an independent module") created a separate tap module that implements tap functionality and exports interfaces that will be used by macvtap and ipvtap modules to create create respective tap devices. However, that patch introduced a regression wherein the modules macvtap and ipvtap can be removed (through modprobe -r) while there are applications using the respective /dev/tapX devices. These applications cause kernel to hold reference to /dev/tapX through 'struct cdev macvtap_cdev' and 'struct cdev ipvtap_dev' defined in macvtap and ipvtap modules respectively. So, when the application is later closed the kernel panics because we are referencing KVA that is present in the unloaded modules. ----------8<------- Example ----------8<---------- $ sudo ip li add name mv0 link enp7s0 type macvtap $ sudo ip li show mv0 |grep mv0| awk -e '{print $1 $2}' 14:mv0@enp7s0: $ cat /dev/tap14 & $ lsmod |egrep -i 'tap|vlan' macvtap 16384 0 macvlan 24576 1 macvtap tap 24576 3 macvtap $ sudo modprobe -r macvtap $ fg cat /dev/tap14 ^C <...system panics...> BUG: unable to handle kernel paging request at ffffffffa038c500 IP: cdev_put+0xf/0x30 ----------8<-----------------8<---------- The fix is to set cdev.owner to the module that creates the tap device (either macvtap or ipvtap). With this set, the operations (in fs/char_dev.c) on char device holds and releases the module through cdev_get() and cdev_put() and will not allow the module to unload prematurely. Fixes: 9a393b5d5988ea4e (tap: tap as an independent module) Signed-off-by: Girish Moodalbail <girish.moodalbail@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds1-11/+16
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fix from Ingo Molnar: "Update the <linux/swait.h> documentation to discourage their use" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/swait: Document it clearly that the swait facilities are special and shouldn't be used
2017-10-27net/sched: Add support for HW offloading for CBSVinicius Costa Gomes1-0/+1
This adds support for offloading the CBS algorithm to the controller, if supported. Drivers wanting to support CBS offload must implement the .ndo_setup_tc callback and handle the TC_SETUP_CBS (introduced here) type. Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Tested-by: Henrik Austad <henrik@austad.us> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-27net: dsa: lan9303: Move struct lan9303 to include/linux/dsa/lan9303.hEgil Hjelmeland1-0/+36
The next patch require net/dsa/tag_lan9303.c to access struct lan9303. Therefore move struct lan9303 definitions from drivers/net/dsa/lan9303.h to new file include/linux/dsa/lan9303.h. Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27bpf: remove tail_call and get_stackid helper declarations from bpf.hGianluca Borello1-3/+0
commit afdb09c720b6 ("security: bpf: Add LSM hooks for bpf object related syscall") included linux/bpf.h in linux/security.h. As a result, bpf programs including bpf_helpers.h and some other header that ends up pulling in also security.h, such as several examples under samples/bpf, fail to compile because bpf_tail_call and bpf_get_stackid are now "redefined as different kind of symbol". >From bpf.h: u64 bpf_tail_call(u64 ctx, u64 r2, u64 index, u64 r4, u64 r5); u64 bpf_get_stackid(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5); Whereas in bpf_helpers.h they are: static void (*bpf_tail_call)(void *ctx, void *map, int index); static int (*bpf_get_stackid)(void *ctx, void *map, int flags); Fix this by removing the unused declaration of bpf_tail_call and moving the declaration of bpf_get_stackid in bpf_trace.c, which is the only place where it's needed. Signed-off-by: Gianluca Borello <g.borello@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27drivers/net: wan/sdla: Convert timers to use timer_setup()Kees Cook1-0/+1
In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. Cc: Allen Pais <allen.lkml@gmail.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Tobias Klauser <tklauser@distanz.ch> Cc: netdev@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-26tcp: TCP experimental option for SMCUrsula Braun1-2/+7
The SMC protocol [1] relies on the use of a new TCP experimental option [2, 3]. With this option, SMC capabilities are exchanged between peers during the TCP three way handshake. This patch adds support for this experimental option to TCP. References: [1] SMC-R Informational RFC: http://www.rfc-editor.org/info/rfc7609 [2] Shared Use of TCP Experimental Options RFC 6994: https://tools.ietf.org/rfc/rfc6994.txt [3] IANA ExID SMCR: http://www.iana.org/assignments/tcp-parameters/tcp-parameters.xhtml#tcp-exids Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-26net/mlx5e: DCBNL, Implement tc with ets type and zero bandwidthHuy Nguyen1-0/+2
Previously, tc with ets type and zero bandwidth is not accepted by driver. This behavior does not follow the IEEE802.1qaz spec. If there are tcs with ets type and zero bandwidth, these tcs are assigned to the lowest priority tc_group #0. We equally distribute 100% bw of the tc_group #0 to these zero bandwidth ets tcs. Also, the non zero bandwidth ets tcs are assigned to tc_group #1. If there is no zero bandwidth ets tc, the non zero bandwidth ets tcs are assigned to tc_group #0. Fixes: cdcf11212b22 ("net/mlx5e: Validate BW weight values of ETS") Signed-off-by: Huy Nguyen <huyn@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-10-26macvlan: remove unused fields in struct macvlan_devGirish Moodalbail2-15/+4
commit 635b8c8ecdd2 ("tap: Renaming tap related APIs, data structures, macros") captured all the tap related fields into a new struct tap_dev. However, it failed to remove those fields from struct macvlan_dev. Those fields are currently unused and must be removed. While there I moved the comment for MAX_TAP_QUEUES to the right place. Fixes: 635b8c8ecdd27142 (tap: Renaming tap related APIs, data structures, macros) Signed-off-by: Girish Moodalbail <girish.moodalbail@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-25bpf: permit multiple bpf attachments for a single perf eventYonghong Song2-9/+64
This patch enables multiple bpf attachments for a kprobe/uprobe/tracepoint single trace event. Each trace_event keeps a list of attached perf events. When an event happens, all attached bpf programs will be executed based on the order of attachment. A global bpf_event_mutex lock is introduced to protect prog_array attaching and detaching. An alternative will be introduce a mutex lock in every trace_event_call structure, but it takes a lot of extra memory. So a global bpf_event_mutex lock is a good compromise. The bpf prog detachment involves allocation of memory. If the allocation fails, a dummy do-nothing program will replace to-be-detached program in-place. Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-24PM / QoS: Fix device resume latency PM QoSRafael J. Wysocki1-2/+3
The special value of 0 for device resume latency PM QoS means "no restriction", but there are two problems with that. First, device resume latency PM QoS requests with 0 as the value are always put in front of requests with positive values in the priority lists used internally by the PM QoS framework, causing 0 to be chosen as an effective constraint value. However, that 0 is then interpreted as "no restriction" effectively overriding the other requests with specific restrictions which is incorrect. Second, the users of device resume latency PM QoS have no way to specify that *any* resume latency at all should be avoided, which is an artificial limitation in general. To address these issues, modify device resume latency PM QoS to use S32_MAX as the "no constraint" value and 0 as the "no latency at all" one and rework its users (the cpuidle menu governor, the genpd QoS governor and the runtime PM framework) to follow these changes. Also add a special "n/a" value to the corresponding user space I/F to allow user space to indicate that it cannot accept any resume latencies at all for the given device. Fixes: 85dc0b8a4019 (PM / QoS: Make it possible to expose PM QoS latency constraints) Link: https://bugzilla.kernel.org/show_bug.cgi?id=197323 Reported-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Alex Shi <alex.shi@linaro.org> Cc: All applicable <stable@vger.kernel.org>
2017-10-24tcp: Configure TFO without cookie per socket and/or per routeChristoph Paasch1-1/+2
We already allow to enable TFO without a cookie by using the fastopen-sysctl and setting it to TFO_SERVER_COOKIE_NOT_REQD (or TFO_CLIENT_NO_COOKIE). This is safe to do in certain environments where we know that there isn't a malicous host (aka., data-centers) or when the application-protocol already provides an authentication mechanism in the first flight of data. A server however might be providing multiple services or talking to both sides (public Internet and data-center). So, this server would want to enable cookie-less TFO for certain services and/or for connections that go to the data-center. This patch exposes a socket-option and a per-route attribute to enable such fine-grained configurations. Signed-off-by: Christoph Paasch <cpaasch@apple.com> Reviewed-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-23sched/swait: Document it clearly that the swait facilities are special and ↵Davidlohr Bueso1-11/+16
shouldn't be used We currently welcome using swait over wait whenever possible because it is a slimmer data structure. However, Linus has made it very clear that he does not want this used, unless under very specific RT scenarios (such as current users). Update the comments before kernel hipsters start thinking swait is the cool thing to do. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Acked-by: Luis R. Rodriguez <mcgrof@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dave@stgolabs.net Cc: wagi@monom.org Link: http://lkml.kernel.org/r/20171020171346.24445-1-dave@stgolabs.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-10-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller17-55/+168
There were quite a few overlapping sets of changes here. Daniel's bug fix for off-by-ones in the new BPF branch instructions, along with the added allowances for "data_end > ptr + x" forms collided with the metadata additions. Along with those three changes came veritifer test cases, which in their final form I tried to group together properly. If I had just trimmed GIT's conflict tags as-is, this would have split up the meta tests unnecessarily. In the socketmap code, a set of preemption disabling changes overlapped with the rename of bpf_compute_data_end() to bpf_compute_data_pointers(). Changes were made to the mv88e6060.c driver set addr method which got removed in net-next. The hyperv transport socket layer had a locking change in 'net' which overlapped with a change of socket state macro usage in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-22Merge branch 'irq-urgent-for-linus' of ↵Linus Torvalds2-1/+3
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Thomas Gleixner: "A set of small fixes mostly in the irq drivers area: - Make the tango irq chip work correctly, which requires a new function in the generiq irq chip implementation - A set of updates to the GIC-V3 ITS driver removing a bogus BUG_ON() and parsing the VCPU table size correctly" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq: generic chip: remove irq_gc_mask_disable_reg_and_ack() irqchip/tango: Use irq_gc_mask_disable_and_ack_set genirq: generic chip: Add irq_gc_mask_disable_and_ack_set() irqchip/gic-v3-its: Add missing changes to support 52bit physical address irqchip/gic-v3-its: Fix the incorrect parsing of VCPU table size irqchip/gic-v3-its: Fix the incorrect BUG_ON in its_init_vpe_domain() DT: arm,gic-v3: Update the ITS size in the examples
2017-10-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds2-1/+4
Pull networking fixes from David Miller: "A little more than usual this time around. Been travelling, so that is part of it. Anyways, here are the highlights: 1) Deal with memcontrol races wrt. listener dismantle, from Eric Dumazet. 2) Handle page allocation failures properly in nfp driver, from Jaku Kicinski. 3) Fix memory leaks in macsec, from Sabrina Dubroca. 4) Fix crashes in pppol2tp_session_ioctl(), from Guillaume Nault. 5) Several fixes in bnxt_en driver, including preventing potential NVRAM parameter corruption from Michael Chan. 6) Fix for KRACK attacks in wireless, from Johannes Berg. 7) rtnetlink event generation fixes from Xin Long. 8) Deadlock in mlxsw driver, from Ido Schimmel. 9) Disallow arithmetic operations on context pointers in bpf, from Jakub Kicinski. 10) Missing sock_owned_by_user() check in sctp_icmp_redirect(), from Xin Long. 11) Only TCP is supported for sockmap, make that explicit with a check, from John Fastabend. 12) Fix IP options state races in DCCP and TCP, from Eric Dumazet. 13) Fix panic in packet_getsockopt(), also from Eric Dumazet. 14) Add missing locked in hv_sock layer, from Dexuan Cui. 15) Various aquantia bug fixes, including several statistics handling cures. From Igor Russkikh et al. 16) Fix arithmetic overflow in devmap code, from John Fastabend. 17) Fix busted socket memory accounting when we get a fault in the tcp zero copy paths. From Willem de Bruijn. 18) Don't leave opt->tot_len uninitialized in ipv6, from Eric Dumazet" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (106 commits) stmmac: Don't access tx_q->dirty_tx before netif_tx_lock ipv6: flowlabel: do not leave opt->tot_len with garbage of_mdio: Fix broken PHY IRQ in case of probe deferral textsearch: fix typos in library helpers rxrpc: Don't release call mutex on error pointer net: stmmac: Prevent infinite loop in get_rx_timestamp_status() net: stmmac: Fix stmmac_get_rx_hwtstamp() net: stmmac: Add missing call to dev_kfree_skb() mlxsw: spectrum_router: Configure TIGCR on init mlxsw: reg: Add Tunneling IPinIP General Configuration Register net: ethtool: remove error check for legacy setting transceiver type soreuseport: fix initialization race net: bridge: fix returning of vlan range op errors sock: correct sk_wmem_queued accounting on efault in tcp zerocopy bpf: add test cases to bpf selftests to cover all access tests bpf: fix pattern matches for direct packet access bpf: fix off by one for range markings with L{T, E} patterns bpf: devmap fix arithmetic overflow in bitmap_size calculation net: aquantia: Bad udp rate on default interrupt coalescing net: aquantia: Enable coalescing management via ethtool interface ...
2017-10-22Merge branch 'for-linus' of ↵Linus Torvalds2-0/+10
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input fixes from Dmitry Torokhov: - joydev now implements a blacklist to avoid creating joystick nodes for accelerometers found in composite devices such as PlaStation controllers - assorted driver fixes * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: ims-psu - check if CDC union descriptor is sane Input: joydev - blacklist ds3/ds4/udraw motion sensors Input: allow matching device IDs on property bits Input: factor out and export input_device_id matching code Input: goodix - poll the 'buffer status' bit before reading data Input: axp20x-pek - fix module not auto-loading for axp221 pek Input: tca8418 - enable interrupt after it has been requested Input: stmfts - fix setting ABS_MT_POSITION_* maximum size Input: ti_am335x_tsc - fix incorrect step config for 5 wire touchscreen Input: synaptics - disable kernel tracking on SMBus devices
2017-10-22drivers, connector: convert cn_callback_entry.refcnt from atomic_t to refcount_tElena Reshetova1-2/+2
atomic_t variables are currently used to implement reference counters with the following properties: - counter is initialized to 1 using atomic_set() - a resource is freed upon counter reaching zero - once counter reaches zero, its further increments aren't allowed - counter schema uses basic atomic operations (set, inc, inc_not_zero, dec_and_test, etc.) Such atomic variables should be converted to a newly provided refcount_t type and API that prevents accidental counter overflows and underflows. This is important since overflows and underflows can lead to use-after-free situation and be exploitable. The variable cn_callback_entry.refcnt is used as pure reference counter. Convert it to refcount_t and fix up the operations. Suggested-by: Kees Cook <keescook@chromium.org> Reviewed-by: David Windsor <dwindsor@gmail.com> Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-22drivers, net, mlx5: convert mlx5_cq.refcount from atomic_t to refcount_tElena Reshetova1-2/+2
atomic_t variables are currently used to implement reference counters with the following properties: - counter is initialized to 1 using atomic_set() - a resource is freed upon counter reaching zero - once counter reaches zero, its further increments aren't allowed - counter schema uses basic atomic operations (set, inc, inc_not_zero, dec_and_test, etc.) Such atomic variables should be converted to a newly provided refcount_t type and API that prevents accidental counter overflows and underflows. This is important since overflows and underflows can lead to use-after-free situation and be exploitable. The variable mlx5_cq.refcount is used as pure reference counter. Convert it to refcount_t and fix up the operations. Suggested-by: Kees Cook <keescook@chromium.org> Reviewed-by: David Windsor <dwindsor@gmail.com> Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-22drivers, net, mlx4: convert mlx4_srq.refcount from atomic_t to refcount_tElena Reshetova1-1/+1
atomic_t variables are currently used to implement reference counters with the following properties: - counter is initialized to 1 using atomic_set() - a resource is freed upon counter reaching zero - once counter reaches zero, its further increments aren't allowed - counter schema uses basic atomic operations (set, inc, inc_not_zero, dec_and_test, etc.) Such atomic variables should be converted to a newly provided refcount_t type and API that prevents accidental counter overflows and underflows. This is important since overflows and underflows can lead to use-after-free situation and be exploitable. The variable mlx4_srq.refcount is used as pure reference counter. Convert it to refcount_t and fix up the operations. Suggested-by: Kees Cook <keescook@chromium.org> Reviewed-by: David Windsor <dwindsor@gmail.com> Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-22drivers, net, mlx4: convert mlx4_qp.refcount from atomic_t to refcount_tElena Reshetova1-1/+1
atomic_t variables are currently used to implement reference counters with the following properties: - counter is initialized to 1 using atomic_set() - a resource is freed upon counter reaching zero - once counter reaches zero, its further increments aren't allowed - counter schema uses basic atomic operations (set, inc, inc_not_zero, dec_and_test, etc.) Such atomic variables should be converted to a newly provided refcount_t type and API that prevents accidental counter overflows and underflows. This is important since overflows and underflows can lead to use-after-free situation and be exploitable. The variable mlx4_qp.refcount is used as pure reference counter. Convert it to refcount_t and fix up the operations. Suggested-by: Kees Cook <keescook@chromium.org> Reviewed-by: David Windsor <dwindsor@gmail.com> Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-22drivers, net, mlx4: convert mlx4_cq.refcount from atomic_t to refcount_tElena Reshetova1-2/+2
atomic_t variables are currently used to implement reference counters with the following properties: - counter is initialized to 1 using atomic_set() - a resource is freed upon counter reaching zero - once counter reaches zero, its further increments aren't allowed - counter schema uses basic atomic operations (set, inc, inc_not_zero, dec_and_test, etc.) Such atomic variables should be converted to a newly provided refcount_t type and API that prevents accidental counter overflows and underflows. This is important since overflows and underflows can lead to use-after-free situation and be exploitable. The variable mlx4_cq.refcount is used as pure reference counter. Convert it to refcount_t and fix up the operations. Suggested-by: Kees Cook <keescook@chromium.org> Reviewed-by: David Windsor <dwindsor@gmail.com> Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21net: sched: add block bind/unbind notif. and extended block_get/putJiri Pirko1-0/+1
Introduce new type of ndo_setup_tc message to propage binding/unbinding of a block to driver. Call this ndo whenever qdisc gets/puts a block. Alongside with this, there's need to propagate binder type from qdisc code down to the notifier. So introduce extended variants of block_get/put in order to pass this info. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21Merge tag 'armsoc-fixes' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Arnd Bergmann: "Here is another set of bugfixes for ARM SoCs, mostly harmless: - a boot regression fix on ux500 - PCIe interrupts on NXP i.MX7 and on Marvell Armada 7K/8K were wired up wrong, in different ways - Armada XP support for large memory never worked - the socfpga reset controller now builds on 64-bit - minor device tree corrections on gemini, mvebu, r-pi 3, rockchip and at91" * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: ARM: ux500: Fix regression while init PM domains ARM: dts: fix PCLK name on Gemini and MOXA ART arm64: dts: rockchip: fix typo in iommu nodes arm64: dts: rockchip: correct vqmmc voltage for rk3399 platforms ARM: dts: imx7d: Invert legacy PCI irq mapping bus: mbus: fix window size calculation for 4GB windows ARM: dts: at91: sama5d2: add ADC hw trigger edge type ARM: dts: at91: sama5d2_xplained: enable ADTRG pin ARM: dts: at91: at91-sama5d27_som1: fix PHY ID ARM: dts: bcm283x: Fix console path on RPi3 reset: socfpga: fix for 64-bit compilation ARM: dts: Fix I2C repeated start issue on Armada-38x arm64: dts: marvell: fix interrupt-map property for Armada CP110 PCIe controller arm64: dts: salvator-common: add 12V regulator to backlight ARM: dts: sun6i: Fix endpoint IDs in second display pipeline arm64: allwinner: a64: pine64: Use dcdc1 regulator for mmc0
2017-10-20selinux: bpf: Add addtional check for bpf object file receiveChenbo Feng1-0/+3
Introduce a bpf object related check when sending and receiving files through unix domain socket as well as binder. It checks if the receiving process have privilege to read/write the bpf map or use the bpf program. This check is necessary because the bpf maps and programs are using a anonymous inode as their shared inode so the normal way of checking the files and sockets when passing between processes cannot work properly on eBPF object. This check only works when the BPF_SYSCALL is configured. Signed-off-by: Chenbo Feng <fengc@google.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Reviewed-by: James Morris <james.l.morris@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-20security: bpf: Add LSM hooks for bpf object related syscallChenbo Feng3-0/+105
Introduce several LSM hooks for the syscalls that will allow the userspace to access to eBPF object such as eBPF programs and eBPF maps. The security check is aimed to enforce a per object security protection for eBPF object so only processes with the right priviliges can read/write to a specific map or use a specific eBPF program. Besides that, a general security hook is added before the multiplexer of bpf syscall to check the cmd and the attribute used for the command. The actual security module can decide which command need to be checked and how the cmd should be checked. Signed-off-by: Chenbo Feng <fengc@google.com> Acked-by: James Morris <james.l.morris@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-20bpf: Add file mode configuration into bpf mapsChenbo Feng1-3/+5
Introduce the map read/write flags to the eBPF syscalls that returns the map fd. The flags is used to set up the file mode when construct a new file descriptor for bpf maps. To not break the backward capability, the f_flags is set to O_RDWR if the flag passed by syscall is 0. Otherwise it should be O_RDONLY or O_WRONLY. When the userspace want to modify or read the map content, it will check the file mode to see if it is allowed to make the change. Signed-off-by: Chenbo Feng <fengc@google.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-20net: Add extack to validator_info structs used for address notifierDavid Ahern1-0/+1
Add extack to in_validator_info and in6_validator_info. Update the one user of each, ipvlan, to return an error message for failures. Only manual configuration of an address is plumbed in the IPv6 code path. Signed-off-by: David Ahern <dsahern@gmail.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-20bpf: avoid preempt enable/disable in sockmap using tcp_skb_cb regionJohn Fastabend1-1/+1
SK_SKB BPF programs are run from the socket/tcp context but early in the stack before much of the TCP metadata is needed in tcp_skb_cb. So we can use some unused fields to place BPF metadata needed for SK_SKB programs when implementing the redirect function. This allows us to drop the preempt disable logic. It does however require an API change so sk_redirect_map() has been updated to additionally provide ctx_ptr to skb. Note, we do however continue to disable/enable preemption around actual BPF program running to account for map updates. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-20Merge branch 'fixes-v4.14-rc5' of ↵Linus Torvalds1-17/+30
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull key handling fixes from James Morris: "This includes a fix for the capabilities code from Colin King, and a set of further fixes for the keys subsystem. From David: - Fix a bunch of places where kernel drivers may access revoked user-type keys and don't do it correctly. - Fix some ecryptfs bits. - Fix big_key to require CONFIG_CRYPTO. - Fix a couple of bugs in the asymmetric key type. - Fix a race between updating and finding negative keys. - Prevent add_key() from updating uninstantiated keys. - Make loading of key flags and expiry time atomic when not holding locks" * 'fixes-v4.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: commoncap: move assignment of fs_ns to avoid null pointer dereference pkcs7: Prevent NULL pointer dereference, since sinfo is not always set. KEYS: load key flags and expiry time atomically in proc_keys_show() KEYS: Load key expiry time atomically in keyring_search_iterator() KEYS: load key flags and expiry time atomically in key_validate() KEYS: don't let add_key() update an uninstantiated key KEYS: Fix race between updating and finding a negative key KEYS: checking the input id parameters before finding asymmetric key KEYS: Fix the wrong index when checking the existence of second id security/keys: BIG_KEY requires CONFIG_CRYPTO ecryptfs: fix dereference of NULL user_key_payload fscrypt: fix dereference of NULL user_key_payload lib/digsig: fix dereference of NULL user_key_payload FS-Cache: fix dereference of NULL user_key_payload KEYS: encrypted: fix dereference of NULL user_key_payload
2017-10-20doc: Fix various RCU docbook comment-header problemsPaul E. McKenney3-9/+16
Because many of RCU's files have not been included into docbook, a number of errors have accumulated. This commit fixes them. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-20membarrier: Provide register expedited private commandMathieu Desnoyers2-0/+19
This introduces a "register private expedited" membarrier command which allows eventual removal of important memory barrier constraints on the scheduler fast-paths. It changes how the "private expedited" membarrier command (new to 4.14) is used from user-space. This new command allows processes to register their intent to use the private expedited command. This affects how the expedited private command introduced in 4.14-rc is meant to be used, and should be merged before 4.14 final. Processes are now required to register before using MEMBARRIER_CMD_PRIVATE_EXPEDITED, otherwise that command returns EPERM. This fixes a problem that arose when designing requested extensions to sys_membarrier() to allow JITs to efficiently flush old code from instruction caches. Several potential algorithms are much less painful if the user register intent to use this functionality early on, for example, before the process spawns the second thread. Registering at this time removes the need to interrupt each and every thread in that process at the first expedited sys_membarrier() system call. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-20Input: allow matching device IDs on property bitsDmitry Torokhov2-0/+7
Let's allow matching input devices on their property bits, both in-kernel and when generating module aliases. Tested-by: Roderick Colenbrander <roderick.colenbrander@sony.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2017-10-20Input: factor out and export input_device_id matching codeDmitry Torokhov1-0/+3
Factor out and export input_match_device_id() so that modules may use it. It will be needed by joydev to blacklist accelerometers in composite devices. Tested-by: Roderick Colenbrander <roderick.colenbrander@sony.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2017-10-19Merge tag 'mvebu-fixes-4.14-2' of git://git.infradead.org/linux-mvebu into fixesArnd Bergmann1-2/+2
Pull "mvebu fixes for 4.14 (part 2)" from Gregory CLEMENT Two device tree related fixes: - One on Armada 38x using a other compatible string for I2C in order to cover an errata. - One for Armada 7K/8K fixing a typo on interrupt-map property for PCIe leading to fail PME and AER root port service initialization And the last one for the mbus fixing the window size calculation when it exceed 32bits * tag 'mvebu-fixes-4.14-2' of git://git.infradead.org/linux-mvebu: bus: mbus: fix window size calculation for 4GB windows ARM: dts: Fix I2C repeated start issue on Armada-38x arm64: dts: marvell: fix interrupt-map property for Armada CP110 PCIe controller
2017-10-19dql: make dql_init return voidStephen Hemminger1-1/+1
dql_init always returned 0, and the only place that uses it in network core code didn't care about the return value anyway. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Acked-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-19Merge branch '40GbE' of ↵David S. Miller1-0/+10
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 40GbE Intel Wired LAN Driver Updates 2017-10-17 This series contains updates to i40e and ethtool. Alan provides most of the changes in this series which are mainly fixes and cleanups. Renamed the ethtool "cmd" variable to "ks", since the new ethtool API passes us ksettings structs instead of command structs. Cleaned up an ifdef that was not accomplishing anything. Added function header comments to provide better documentation. Fixed two issues in i40e_get_link_ksettings(), by calling ethtool_link_ksettings_zero_link_mode() to ensure the advertising and link masks are cleared before we start setting bits. Cleaned up and fixed code comments which were incorrect. Separated the setting of autoneg in i40e_phy_types_to_ethtool() into its own conditional to clarify what PHYs support and advertise autoneg, and makes it easier to add new PHY types in the future. Added ethtool functionality to intersect two link masks together to find the common ground between them. Overhauled i40e to ensure that the new ethtool API macros are being used, instead of the old ones. Fixed the usage of unsigned 64-bit division which is not supported on all architectures. Sudheer adds support for 25G Active Optical Cables (AOC) and Active Copper Cables (ACC) PHY types. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>