summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2018-08-11net: Add ID (if needed) to sock_reuseport and expose reuseport_lockMartin KaFai Lau1-0/+6
A later patch will introduce a BPF_MAP_TYPE_REUSEPORT_ARRAY which allows a SO_REUSEPORT sk to be added to a bpf map. When a sk is removed from reuse->socks[], it also needs to be removed from the bpf map. Also, when adding a sk to a bpf map, the bpf map needs to ensure it is indeed in a reuse->socks[]. Hence, reuseport_lock is needed by the bpf map to ensure its map_update_elem() and map_delete_elem() operations are in-sync with the reuse->socks[]. The BPF_MAP_TYPE_REUSEPORT_ARRAY map will only acquire the reuseport_lock after ensuring the adding sk is already in a reuseport group (i.e. reuse->socks[]). The map_lookup_elem() will be lockless. This patch also adds an ID to sock_reuseport. A later patch will introduce BPF_PROG_TYPE_SK_REUSEPORT which allows a bpf prog to select a sk from a bpf map. It is inflexible to statically enforce a bpf map can only contain the sk belonging to a particular reuse->socks[] (i.e. same IP:PORT) during the bpf verification time. For example, think about the the map-in-map situation where the inner map can be dynamically changed in runtime and the outer map may have inner maps belonging to different reuseport groups. Hence, when the bpf prog (in the new BPF_PROG_TYPE_SK_REUSEPORT type) selects a sk, this selected sk has to be checked to ensure it belongs to the requesting reuseport group (i.e. the group serving that IP:PORT). The "sk->sk_reuseport_cb" pointer cannot be used for this checking purpose because the pointer value will change after reuseport_grow(). Instead of saving all checking conditions like the ones preced calling "reuseport_add_sock()" and compare them everytime a bpf_prog is run, a 32bits ID is introduced to survive the reuseport_grow(). The ID is only acquired if any of the reuse->socks[] is added to the newly introduced "BPF_MAP_TYPE_REUSEPORT_ARRAY" map. If "BPF_MAP_TYPE_REUSEPORT_ARRAY" is not used, the changes in this patch is a no-op. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-11tcp: Avoid TCP syncookie rejected by SO_REUSEPORT socketMartin KaFai Lau2-2/+32
Although the actual cookie check "__cookie_v[46]_check()" does not involve sk specific info, it checks whether the sk has recent synq overflow event in "tcp_synq_no_recent_overflow()". The tcp_sk(sk)->rx_opt.ts_recent_stamp is updated every second when it has sent out a syncookie (through "tcp_synq_overflow()"). The above per sk "recent synq overflow event timestamp" works well for non SO_REUSEPORT use case. However, it may cause random connection request reject/discard when SO_REUSEPORT is used with syncookie because it fails the "tcp_synq_no_recent_overflow()" test. When SO_REUSEPORT is used, it usually has multiple listening socks serving TCP connection requests destinated to the same local IP:PORT. There are cases that the TCP-ACK-COOKIE may not be received by the same sk that sent out the syncookie. For example, if reuse->socks[] began with {sk0, sk1}, 1) sk1 sent out syncookies and tcp_sk(sk1)->rx_opt.ts_recent_stamp was updated. 2) the reuse->socks[] became {sk1, sk2} later. e.g. sk0 was first closed and then sk2 was added. Here, sk2 does not have ts_recent_stamp set. There are other ordering that will trigger the similar situation below but the idea is the same. 3) When the TCP-ACK-COOKIE comes back, sk2 was selected. "tcp_synq_no_recent_overflow(sk2)" returns true. In this case, all syncookies sent by sk1 will be handled (and rejected) by sk2 while sk1 is still alive. The userspace may create and remove listening SO_REUSEPORT sockets as it sees fit. E.g. Adding new thread (and SO_REUSEPORT sock) to handle incoming requests, old process stopping and new process starting...etc. With or without SO_ATTACH_REUSEPORT_[CB]BPF, the sockets leaving and joining a reuseport group makes picking the same sk to check the syncookie very difficult (if not impossible). The later patches will allow bpf prog more flexibility in deciding where a sk should be located in a bpf map and selecting a particular SO_REUSEPORT sock as it sees fit. e.g. Without closing any sock, replace the whole bpf reuseport_array in one map_update() by using map-in-map. Getting the syncookie check working smoothly across socks in the same "reuse->socks[]" is important. A partial solution is to set the newly added sk's ts_recent_stamp to the max ts_recent_stamp of a reuseport group but that will require to iterate through reuse->socks[] OR pessimistically set it to "now - TCP_SYNCOOKIE_VALID" when a sk is joining a reuseport group. However, neither of them will solve the existing sk getting moved around the reuse->socks[] and that sk may not have ts_recent_stamp updated, unlikely under continuous synflood but not impossible. This patch opts to treat the reuseport group as a whole when considering the last synq overflow timestamp since they are serving the same IP:PORT from the userspace (and BPF program) perspective. "synq_overflow_ts" is added to "struct sock_reuseport". The tcp_synq_overflow() and tcp_synq_no_recent_overflow() will update/check reuse->synq_overflow_ts if the sk is in a reuseport group. Similar to the reuseport decision in __inet_lookup_listener(), both sk->sk_reuseport and sk->sk_reuseport_cb are tested for SO_REUSEPORT usage. Update on "synq_overflow_ts" happens at roughly once every second. A synflood test was done with a 16 rx-queues and 16 reuseport sockets. No meaningful performance change is observed. Before and after the change is ~9Mpps in IPv4. Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-11Merge branch 'for-upstream' of ↵David S. Miller1-0/+5
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Johan Hedberg says: ==================== pull request: bluetooth-next 2018-08-10 Here's one more (most likely last) bluetooth-next pull request for the 4.19 kernel. - Added support for MediaTek serial Bluetooth devices - Initial skeleton for controller-side address resolution support - Fix BT_HCIUART_RTL related Kconfig dependencies - A few other minor fixes/cleanups Please let me know if there are any issues pulling. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-11gpio: mmio: Fix up inverted direction registersLinus Walleij1-0/+3
The bgpio_init() takes one of two arguments to specify a register to set the direction of the GPIO line: either dirout that indicates that a 1 in the bit in that register sets the corresponding line to output, or dirin which indicates that a 1 in the bit in that register sets the corresponding line to input. Conversely setting the bit to 0 on these will turn the line into input and output respectively. One of these can be defined but not both. This means that a platform that sets a bit to 1 for output only defines dirout and a platform that sets a bit to 0 for output only defines dirin. In short this defines the polarity of the direction register. Both can also be left as NULL meaning the GPIO chip is either input only or output only. Tomer Maimon discovered that for get/set chips (those where the get and set registers are defined but no separate clear register, and specifying BGPIOF_READ_OUTPUT_REG_SET so that we say we want to read the output value from the SET register) we are unconditionally reading the value from the SET register when the direction bit is 1 and from the DAT register when the direction bit is 0, not taking the direction bit polarity into account. It would be expected that when the direction bit is inverted (dirin is defined but not dirout) we read the current value from the DAT register when the bit is 1 and from the SET register when the bit is 0. Currently only some versions of ATH79, brcmstb, some versions of CLP711x, GE, IOP and Loongson use the dirin mode (a 1 in the register means input). They are unaffected because BGPIOF_READ_OUTPUT_REG_SET is not set on any of them. (They do not read back the SET register to figure out the output value.) So this is no regression with current drivers. However the behaviour is wrong and does not work with Tomer's new driver where he needs to use the BGIOF_READ_OUTPUT_REG_SET. This fixes the above issue by: - Instead of defining separate functions for the inverted case, set up a flag in the gpio_chip that indicates that the direction is inverted. - Remove the special inverted functions for setting input/output and getting the direction, rely on the flag instead. - Respect this flag in bgpio_get_set() and bgpio_get_set_multiple() Reported-by: Tomer Maimon <tmaimon77@gmail.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2018-08-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller6-23/+40
Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following batch contains netfilter updates for your net-next tree: 1) Expose NFT_OSF_MAXGENRELEN maximum OS name length from the new OS passive fingerprint matching extension, from Fernando Fernandez. 2) Add extension to support for fine grain conntrack timeout policies from nf_tables. As preparation works, this patchset moves nf_ct_untimeout() to nf_conntrack_timeout and it also decouples the timeout policy from the ctnl_timeout object, most work done by Harsha Sharma. 3) Enable connection tracking when conntrack helper is in place. 4) Missing enumeration in uapi header when splitting original xt_osf to nfnetlink_osf, also from Fernando. 5) Fix a sparse warning due to incorrect typing in the nf_osf_find(), from Wei Yongjun. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-10net: Provide stub for __netif_set_xps_queue if there is no CONFIG_XPSKrzysztof Kozlowski1-0/+7
Building virtio_net driver without CONFIG_XPS fails with: drivers/net/virtio_net.c: In function ‘virtnet_set_affinity’: drivers/net/virtio_net.c:1910:3: error: implicit declaration of function ‘__netif_set_xps_queue’ [-Werror=implicit-function-declaration] __netif_set_xps_queue(vi->dev, mask, i, false); ^ Fixes: 4d99f6602cb5 ("net: allow to call netif_reset_xps_queues() under cpus_read_lock") Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-10Merge branch 'spi-4.19' into spi-nextMark Brown3-257/+16
2018-08-10Merge branch 'regulator-4.19' into regulator-nextMark Brown10-6/+356
2018-08-10regulator: dt-bindings: add QCOM RPMh regulator bindingsDavid Collins1-0/+36
Introduce bindings for RPMh regulator devices found on some Qualcomm Technlogies, Inc. SoCs. These devices allow a given processor within the SoC to make PMIC regulator requests which are aggregated within the RPMh hardware block along with requests from other processors in the SoC to determine the final PMIC regulator hardware state. Signed-off-by: David Collins <collinsd@codeaurora.org> Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-08-10Merge tag 'qcom-drivers-for-4.19' of ↵Mark Brown5-0/+305
git://git.kernel.org/pub/scm/linux/kernel/git/agross/linux into regulator-4.19 for RPMH Qualcomm ARM Based Driver Updates for v4.19 * Add Qualcomm LLCC driver * Add Qualcomm RPMH controller * Fix memleak in Qualcomm RMTFS * Add dummy qcom_scm_assign_mem() * Fix check for global partition in SMEM
2018-08-10Bluetooth: Add definitions for LE set address resolutionAnkit Navik1-0/+3
Add the definitions for LE address resolution enable HCI commands. When the LE address resolution enable gets changed via HCI commands make sure that flag gets updated. Signed-off-by: Ankit Navik <ankit.p.navik@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-08-10xdp: Helpers for disabling napi_direct of xdp_return_frameToshiaki Makita1-0/+25
We need some mechanism to disable napi_direct on calling xdp_return_frame_rx_napi() from some context. When veth gets support of XDP_REDIRECT, it will redirects packets which are redirected from other devices. On redirection veth will reuse xdp_mem_info of the redirection source device to make return_frame work. But in this case .ndo_xdp_xmit() called from veth redirection uses xdp_mem_info which is not guarded by NAPI, because the .ndo_xdp_xmit() is not called directly from the rxq which owns the xdp_mem_info. This approach introduces a flag in bpf_redirect_info to indicate that napi_direct should be disabled even when _rx_napi variant is used as well as helper functions to use it. A NAPI handler who wants to use this flag needs to call xdp_set_return_frame_no_direct() before processing packets, and call xdp_clear_return_frame_no_direct() after xdp_do_flush_map() before exiting NAPI. v4: - Use bpf_redirect_info for storing the flag instead of xdp_mem_info to avoid per-frame copy cost. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-10bpf: Make redirect_info accessible from modulesToshiaki Makita1-0/+10
We are going to add kern_flags field in redirect_info for kernel internal use. In order to avoid function call to access the flags, make redirect_info accessible from modules. Also as it is now non-static, add prefix bpf_ to redirect_info. v6: - Fix sparse warning around EXPORT_SYMBOL. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-10xdp: Helper function to clear kernel pointers in xdp_frameToshiaki Makita1-0/+7
xdp_frame has kernel pointers which should not be readable from bpf programs. When we want to reuse xdp_frame region but it may be read by bpf programs later, we can use this helper to clear kernel pointers. This is more efficient than calling memset() for the entire struct. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-10net: Export skb_headers_offset_updateToshiaki Makita1-0/+1
This is needed for veth XDP which does skb_copy_expand()-like operation. v2: - Drop skb_copy_header part because it has already been exported now. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-10PCI: Hide ACS quirk declarations inside PCI coreBjorn Helgaas1-11/+0
Move declarations for these functions: pci_dev_specific_acs_enabled() pci_dev_specific_enable_acs() from include/linux/pci.h to drivers/pci/pci.h because nothing outside the PCI core needs to use them. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-08-10net: skbuff.h: fix using plain integer as NULL warningYueHaibing1-1/+1
Fixes the following sparse warning: ./include/linux/skbuff.h:2365:58: warning: Using plain integer as NULL pointer Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-10qed/qede: Multi CoS support.Manish Chopra1-0/+6
This patch adds support for tc mqprio offload, using this different traffic classes on the adapter can be utilized based on configured priority to tc map. For example - tc qdisc add dev eth0 root mqprio num_tc 4 map 0 1 2 3 This will cause SKBs with priority 0,1,2,3 to transmit over tc 0,1,2,3 hardware queues respectively. Signed-off-by: Manish Chopra <manish.chopra@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09PCI: Export pcie_has_flr()Alex Williamson1-0/+1
pcie_flr() suggests pcie_has_flr() to ensure that PCIe FLR support is present prior to calling. pcie_flr() is exported while pcie_has_flr() is not. Resolve this. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-08-09Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/netDavid S. Miller2-2/+7
Overlapping changes in RXRPC, changing to ktime_get_seconds() whilst adding some tracepoints. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09net: stmmac: Add XGMAC 2.10 HWIF entryJose Abreu1-0/+1
Add a new entry to HWIF table for XGMAC 2.10. For now we fill it with empty callbacks which will be added in posterior patches. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Joao Pinto <jpinto@synopsys.com> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09phylink: add helper for configuring 2500BaseX modesRussell King1-0/+1
Add a helper for MAC drivers to use in their validate callback to deal with 2500BaseX vs 1000BaseX modes, where the hardware supports both but it is not possible to automatically select between them. This helper defaults to 1000BaseX, as that is the 802.3 standard, and will allow users to select 2500BaseX either by forcing the speed if AN is disabled, or by changing the advertising mask if AN is enabled. Disabling AN is not recommended as it is only the speed that we're interested in controlling, not the duplex or pause mode parameters. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09ssb: Remove SSB_WARN_ON, SSB_BUG_ON and SSB_DEBUGMichael Büsch1-2/+0
Use the standard WARN_ON instead. If a small kernel is desired, WARN_ON can be disabled globally. Also remove SSB_DEBUG. Besides WARN_ON it only adds a tiny debug check. Include this check unconditionally. Signed-off-by: Michael Buesch <m@bues.ch> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-08-09blkcg: Introduce blkg_root_lookup()Bart Van Assche1-0/+18
This new function will be used in a later patch to verify whether a queue has been dissociated from the cgroup controller before being released. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Tejun Heo <tj@kernel.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Ming Lei <ming.lei@redhat.com> Cc: Omar Sandoval <osandov@fb.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> Cc: Alexandru Moise <00moses.alexander00@gmail.com> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: <stable@vger.kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-09block: Remove two superfluous #include directivesBart Van Assche1-2/+0
Commit 12f5b9314545 ("blk-mq: Remove generation seqeunce") removed the only seqcount_t and u64_stats_sync instances from <linux/blkdev.h> but did not remove the corresponding #include directives. Since these include directives are no longer needed, remove them. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Keith Busch <keith.busch@intel.com> Cc: Ming Lei <ming.lei@redhat.com> Cc: Jianchao Wang <jianchao.w.wang@oracle.com> Cc: Hannes Reinecke <hare@suse.com>, Cc: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-09Merge branch 'asoc-4.19' into asoc-nextMark Brown33-390/+549
2018-08-09Merge branch 'regmap-4.19' into regmap-nextMark Brown2-3/+62
2018-08-09Merge tag 'regmap-noinc-read' into regmap-4.19Mark Brown1-0/+19
regmap: Support non-incrementing registers Some devices have individual registers that don't autoincrement the register address during bulk reads but instead repeatedly read the same value, for example for monitoring GPIOs or ADCs. Add support for these.
2018-08-09regmap: Add regmap_noinc_read APICrestez Dan Leonard1-0/+19
The regmap API usually assumes that bulk read operations will read a range of registers but some I2C/SPI devices have certain registers for which a such a read operation will return data from an internal FIFO instead. Add an explicit API to support bulk read without range semantics. Some linux drivers use regmap_bulk_read or regmap_raw_read for such registers, for example mpu6050 or bmi150 from IIO. This only happens to work because when caching is disabled a single regmap read op will map to a single bus read op (as desired). This breaks if caching is enabled and reg+1 happens to be a cacheable register. Without regmap support refactoring a driver to enable regmap caching requires separate I2C and SPI paths. This is exactly what regmap is supposed to help avoid. Suggested-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Crestez Dan Leonard <leonard.crestez@intel.com> Signed-off-by: Stefan Popa <stefan.popa@analog.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-08-09arm64 / ACPI: clean the additional checks before calling ghes_notify_sea()Dongjiu Geng1-0/+4
In order to remove the additional check before calling the ghes_notify_sea(), make stub definition when !CONFIG_ACPI_APEI_SEA. After this cleanup, we can simply call the ghes_notify_sea() to let APEI driver handle the SEA notification. Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-08-09net/mlx5: Remove unused mlx5_query_vport_admin_stateEli Cohen1-2/+0
mlx5_query_vport_admin_state() is not used anywhere. Remove it. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09net/mlx5: Rename modify/query_vport state related enumsEran Ben Elisha2-5/+5
Modify and query vport state commands share the same admin_state and op_mod values, rename the enums to fit them both. In addition, remove the esw prefix from the admin state enum as this also applied for vnic. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09net/mlx5: Use max_num_eqs for calculation of required MSIX vectorsDenis Drozdov1-1/+4
New firmware has defined new HCA capability field called "max_num_eqs", that is the number of available EQs after subtracting reserved FW EQs. Before this capability the FW reported the EQ number in "log_max_eqs", the reported value also contained FW reserved EQs, but the driver might be failing to load on 320 cpus systems due to the fact that FW reserved EQs were not available to the driver. Now the driver has to obtain max_num_eqs value from new FW to get real number of EQs available. Signed-off-by: Denis Drozdov <denisd@mellanox.com> Reviewed-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-08netfilter: nfnetlink_osf: add missing enum in nfnetlink_osf uapi headerFernando Fernandez Mancera3-12/+13
xt_osf_window_size_options was originally part of include/uapi/linux/netfilter/xt_osf.h, restore it. Fixes: bfb15f2a95cb ("netfilter: extract Passive OS fingerprint infrastructure from xt_osf") Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-08nvme.h: add support for ns write protect definitionsChaitanya Kulkarni1-2/+15
Add various definitions from NVMe 1.3 TP 4005. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-08-08nvme.h: fixup ANA group descriptor formatHannes Reinecke1-1/+1
ANA Phase 3 draft had the 'reserved' field in the group descriptor format set to '23:17' (so that the first namespace identifier started at byte 24), but that got move with the approved TP to '31:17' (so that the first namespace identifier started at byte 32). Signed-off-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-08-08regulator: maxim: Add SPDX license identifiersKrzysztof Kozlowski1-4/+1
Replace GPL v2.0 and v2.0+ license statements with SPDX license identifiers. Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-08-08llc: use refcount_inc_not_zero() for llc_sap_find()Cong Wang1-0/+5
llc_sap_put() decreases the refcnt before deleting sap from the global list. Therefore, there is a chance llc_sap_find() could find a sap with zero refcnt in this global list. Close this race condition by checking if refcnt is zero or not in llc_sap_find(), if it is zero then it is being removed so we can just treat it as gone. Reported-by: <syzbot+278893f3f7803871f7ce@syzkaller.appspotmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-08net: phy: Add support for Broadcom Omega internal Combo GPHYArun Parameswaran1-0/+1
Add support for the Broadcom Omega SoC internal Combo Ethernet GPHY to the bcm7xxx phy driver. Signed-off-by: Arun Parameswaran <arun.parameswaran@broadcom.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-07Merge branch 'drm-next-4.19' of git://people.freedesktop.org/~agd5f/linux ↵Dave Airlie2-0/+152
into drm-next Fixes for 4.19: - Fix UVD 7.2 instance handling - Fix UVD 7.2 harvesting - GPU scheduler fix for when a process is killed - TTM cleanups - amdgpu CS bo_list fixes - Powerplay fixes for polaris12 and CZ/ST - DC fixes for link training certain HMDs - DC fix for vega10 blank screen in certain cases From: Alex Deucher <alexdeucher@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180801222906.1016-1-alexander.deucher@amd.com
2018-08-07vsock: split dwork to avoid reinitializationsCong Wang1-2/+2
syzbot reported that we reinitialize an active delayed work in vsock_stream_connect(): ODEBUG: init active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x90 kernel/workqueue.c:1414 WARNING: CPU: 1 PID: 11518 at lib/debugobjects.c:329 debug_print_object+0x16a/0x210 lib/debugobjects.c:326 The pattern is apparently wrong, we should only initialize the dealyed work once and could repeatly schedule it. So we have to move out the initializations to allocation side. And to avoid confusion, we can split the shared dwork into two, instead of re-using the same one. Fixes: d021c344051a ("VSOCK: Introduce VM Sockets") Reported-by: <syzbot+8a9b1bd330476a4f3db6@syzkaller.appspotmail.com> Cc: Andy king <acking@vmware.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Jorgen Hansen <jhansen@vmware.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-07net/sched: allow flower to match tunnel optionsPieter Jansen van Vuuren1-0/+26
Allow matching on options in Geneve tunnel headers. This makes use of existing tunnel metadata support. The options can be described in the form CLASS:TYPE:DATA/CLASS_MASK:TYPE_MASK:DATA_MASK, where CLASS is represented as a 16bit hexadecimal value, TYPE as an 8bit hexadecimal value and DATA as a variable length hexadecimal value. e.g. # ip link add name geneve0 type geneve dstport 0 external # tc qdisc add dev geneve0 ingress # tc filter add dev geneve0 protocol ip parent ffff: \ flower \ enc_src_ip 10.0.99.192 \ enc_dst_ip 10.0.99.193 \ enc_key_id 11 \ geneve_opts 0102:80:1122334421314151/ffff:ff:ffffffffffffffff \ ip_proto udp \ action mirred egress redirect dev eth1 This patch adds support for matching Geneve options in the order supplied by the user. This leads to an efficient implementation in the software datapath (and in our opinion hardware datapaths that offload this feature). It is also compatible with Geneve options matching provided by the Open vSwitch kernel datapath which is relevant here as the Flower classifier may be used as a mechanism to program flows into hardware as a form of Open vSwitch datapath offload (sometimes referred to as OVS-TC). The netlink Kernel/Userspace API may be extended, for example by adding a flag, if other matching options are desired, for example matching given options in any order. This would require an implementation in the TC software datapath. And be done in a way that drivers that facilitate offload of the Flower classifier can reject or accept such flows based on hardware datapath capabilities. This approach was discussed and agreed on at Netconf 2017 in Seoul. Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-07flow_dissector: allow dissection of tunnel options from metadataSimon Horman1-0/+17
Allow the existing 'dissection' of tunnel metadata to 'dissect' options already present in tunnel metadata. This dissection is controlled by a new dissector key, FLOW_DISSECTOR_KEY_ENC_OPTS. This dissection only occurs when skb_flow_dissect_tunnel_info() is called, currently only the Flower classifier makes that call. So there should be no impact on other users of the flow dissector. This is in preparation for allowing the flower classifier to match on Geneve options. Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-07ethtool: Add WAKE_FILTER and RX_CLS_FLOW_WAKEFlorian Fainelli1-1/+4
Add the ability to specify through ethtool::rxnfc that a rule location is special and will be used to participate in Wake-on-LAN, by e.g: having a specific pattern be matched. When this is the case, fs->ring_cookie must be set to the special value RX_CLS_FLOW_WAKE. We also define an additional ethtool::wolinfo flag: WAKE_FILTER which can be used to configure an Ethernet adapter to allow Wake-on-LAN using previously programmed filters. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-07Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller5-8/+119
Daniel Borkmann says: ==================== pull-request: bpf-next 2018-08-07 The following pull-request contains BPF updates for your *net-next* tree. The main changes are: 1) Add cgroup local storage for BPF programs, which provides a fast accessible memory for storing various per-cgroup data like number of transmitted packets, etc, from Roman. 2) Support bpf_get_socket_cookie() BPF helper in several more program types that have a full socket available, from Andrey. 3) Significantly improve the performance of perf events which are reported from BPF offload. Also convert a couple of BPF AF_XDP samples overto use libbpf, both from Jakub. 4) seg6local LWT provides the End.DT6 action, which allows to decapsulate an outer IPv6 header containing a Segment Routing Header. Adds this action now to the seg6local BPF interface, from Mathieu. 5) Do not mark dst register as unbounded in MOV64 instruction when both src and dst register are the same, from Arthur. 6) Define u_smp_rmb() and u_smp_wmb() to their respective barrier instructions on arm64 for the AF_XDP sample code, from Brian. 7) Convert the tcp_client.py and tcp_server.py BPF selftest scripts over from Python 2 to Python 3, from Jeremy. 8) Enable BTF build flags to the BPF sample code Makefile, from Taeung. 9) Remove an unnecessary rcu_read_lock() in run_lwt_bpf(), from Taehee. 10) Several improvements to the README.rst from the BPF documentation to make it more consistent with RST format, from Tobin. 11) Replace all occurrences of strerror() by calls to strerror_r() in libbpf and fix a FORTIFY_SOURCE build error along with it, from Thomas. 12) Fix a bug in bpftool's get_btf() function to correctly propagate an error via PTR_ERR(), from Yue. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-07netfilter: nft_ct: add ct timeout supportHarsha Sharma1-1/+13
This patch allows to add, list and delete connection tracking timeout policies via nft objref infrastructure and assigning these timeout via nft rule. %./libnftnl/examples/nft-ct-timeout-add ip raw cttime tcp Ruleset: table ip raw { ct timeout cttime { protocol tcp; policy = {established: 111, close: 13 } } chain output { type filter hook output priority -300; policy accept; ct timeout set "cttime" } } %./libnftnl/examples/nft-rule-ct-timeout-add ip raw output cttime %conntrack -E [NEW] tcp 6 111 ESTABLISHED src=172.16.19.128 dst=172.16.19.1 sport=22 dport=41360 [UNREPLIED] src=172.16.19.1 dst=172.16.19.128 sport=41360 dport=22 %nft delete rule ip raw output handle <handle> %./libnftnl/examples/nft-ct-timeout-del ip raw cttime Joint work with Pablo Neira. Signed-off-by: Harsha Sharma <harshasharmaiitr@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-07netfilter: remove ifdef around cttimeout in struct nf_conntrack_l4protoPablo Neira Ayuso1-2/+0
Simplify this, include it inconditionally in this structure layout as we do with ctnetlink. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-07netfilter: cttimeout: decouple timeout policy from nfnetlink_cttimeout objectPablo Neira Ayuso1-9/+13
The timeout policy is currently embedded into the nfnetlink_cttimeout object, move the policy into an independent object. This allows us to reuse part of the existing conntrack timeout extension from nf_tables without adding dependencies with the nfnetlink_cttimeout object layout. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-07netfilter: cttimeout: move ctnl_untimeout to nf_conntrackHarsha Sharma1-0/+1
As, ctnl_untimeout is required by nft_ct, so move ctnl_timeout from nfnetlink_cttimeout to nf_conntrack_timeout and rename as nf_ct_timeout. Signed-off-by: Harsha Sharma <harshasharmaiitr@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-07netfilter: nft_osf: use NFT_OSF_MAXGENRELEN instead of IFNAMSIZFernando Fernandez Mancera1-0/+1
As no "genre" on pf.os exceed 16 bytes of length, we reduce NFT_OSF_MAXGENRELEN parameter to 16 bytes and use it instead of IFNAMSIZ. Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>