summaryrefslogtreecommitdiff
path: root/include/net
AgeCommit message (Collapse)AuthorFilesLines
2020-09-22netfilter: nf_tables: Remove ununsed function nft_data_debugYueHaibing1-7/+0
It is never used, so can be removed. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-09-22Merge tag 'mac80211-next-for-net-next-2020-09-21' of ↵David S. Miller2-9/+158
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Johannes Berg says: ==================== This time we have: * some AP-side infrastructure for FILS discovery and unsolicited probe resonses * a major rework of the encapsulation/header conversion offload from Felix, to fit better with e.g. AP_VLAN interfaces * performance fix for VHT A-MPDU size, don't limit to HT * some initial patches for S1G (sub 1 GHz) support, more will come in this area * minor cleanups ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-21net: sctp: Fix IPv6 ancestor_size calc in sctp_copy_descendantHenry Ptasinski1-3/+5
When calculating ancestor_size with IPv6 enabled, simply using sizeof(struct ipv6_pinfo) doesn't account for extra bytes needed for alignment in the struct sctp6_sock. On x86, there aren't any extra bytes, but on ARM the ipv6_pinfo structure is aligned on an 8-byte boundary so there were 4 pad bytes that were omitted from the ancestor_size calculation. This would lead to corruption of the pd_lobby pointers, causing an oops when trying to free the sctp structure on socket close. Fixes: 636d25d557d1 ("sctp: not copy sctp_sock pd_lobby in sctp_copy_descendant") Signed-off-by: Henry Ptasinski <hptasinski@google.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-19net: dsa: wire up devlink info getAndrew Lunn1-1/+4
Allow the DSA drivers to implement the devlink call to get info info, e.g. driver name, firmware version, ASIC ID, etc. v2: Combine declaration and the assignment on a single line. Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-19net: dsa: Add devlink regions support to DSAAndrew Lunn1-0/+6
Allow DSA drivers to make use of devlink regions, via simple wrappers. Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-19net: dsa: Add helper to convert from devlink to dsAndrew Lunn1-0/+7
Given a devlink instance, return the dsa switch it is associated to. Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-19net: devlink: region: Pass the region ops to the snapshot functionAndrew Lunn1-1/+3
Pass the region to be snapshotted to the function performing the snapshot. This allows one function to operate on numerous regions. v4: Add missing kerneldoc for ICE Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-19net: devlink: regions: Add a priv member to the regions ops structAndrew Lunn1-0/+2
The driver may have multiple regions which can be dumped using one function. However, for this to work, additional information is needed. Add a priv member to the ops structure for the driver to use however it likes. Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-18devlink: collect flash notify params into a structShannon Nelson1-0/+19
The dev flash status notify function parameter lists are getting rather long, so add a struct to be filled and passed rather than continuously changing the function signatures. Signed-off-by: Shannon Nelson <snelson@pensando.io> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-18devlink: add timeout information to status_notifyShannon Nelson1-0/+4
Add a timeout element to the DEVLINK_CMD_FLASH_UPDATE_STATUS netlink message for use by a userland utility to show that a particular firmware flash activity may take a long but bounded time to finish. Also add a handy helper for drivers to make use of the new timeout value. UI usage hints: - if non-zero, add timeout display to the end of the status line [component] status_msg ( Xm Ys : Am Bs ) using the timeout value for Am Bs and updating the Xm Ys every second - if the timeout expires while awaiting the next update, display something like [component] status_msg ( timeout reached : Am Bs ) - if new status notify messages are received, remove the timeout and start over Signed-off-by: Shannon Nelson <snelson@pensando.io> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-18mac80211: fix some encapsulation offload kernel-docJohannes Berg1-1/+3
Add a missing kernel-doc entry for the offload_flags, and correct the name of the update_vif_offload method. Link: https://lore.kernel.org/r/20200918132115.d46a0915ba8a.Ibba536d04a5a5fb655f8ef6e51b247457bfda4ca@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-09-18cfg80211: add missing kernel-doc for S1G band capabilitiesJohannes Berg1-0/+1
Add missing kernel-doc for the S1G band capabilities in the per band data. Link: https://lore.kernel.org/r/20200918131921.08c893cd73a1.Id71583c37baca8a9a3329426e02b66d9ab65ac03@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-09-18mac80211: Unsolicited broadcast probe response supportAloka Dixit1-0/+20
This patch adds mac80211 support to configure unsolicited broadcast probe response transmission for in-band discovery in 6GHz. Changes include functions to store and retrieve probe response template, and packet interval (0 - 20 TUs). Setting interval to 0 disables the unsolicited broadcast probe response transmission. Signed-off-by: Aloka Dixit <alokad@codeaurora.org> Link: https://lore.kernel.org/r/010101747a946b35-ad25858a-1f1f-48df-909e-dc7bf26d9169-000000@us-west-2.amazonses.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-09-18nl80211: Unsolicited broadcast probe response supportAloka Dixit1-0/+18
This patch adds new attributes to support unsolicited broadcast probe response transmission used for in-band discovery in 6GHz band (IEEE P802.11ax/D6.0 26.17.2.3.2, AP behavior for fast passive scanning). The new attribute, NL80211_ATTR_UNSOL_BCAST_PROBE_RESP, is nested which supports following parameters: (1) NL80211_UNSOL_BCAST_PROBE_RESP_ATTR_INT - Packet interval (2) NL80211_UNSOL_BCAST_PROBE_RESP_ATTR_TMPL - Template data Signed-off-by: Aloka Dixit <alokad@codeaurora.org> Link: https://lore.kernel.org/r/010101747a946698-aac263ae-2ed3-4dab-9590-0bc7131214e1-000000@us-west-2.amazonses.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-09-18mac80211: Add FILS discovery supportAloka Dixit1-0/+27
This patch adds mac80211 support to configure FILS discovery transmission. Changes include functions to store and retrieve FILS discovery template, minimum and maximum packet intervals. Signed-off-by: Aloka Dixit <alokad@codeaurora.org> Link: https://lore.kernel.org/r/20200805011838.28166-3-alokad@codeaurora.org [remove SUPPORTS_FILS_DISCOVERY, driver can just set wiphy info] Link: https://lore.kernel.org/r/010101747a7b3cbb-6edaa89c-436d-4391-8765-61456d7f5f4e-000000@us-west-2.amazonses.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-09-18nl80211: Add FILS discovery supportAloka Dixit1-0/+19
FILS discovery attribute, NL80211_ATTR_FILS_DISCOVERY, is nested which supports following parameters as given in IEEE Std 802.11ai-2016, Annex C.3 MIB detail: (1) NL80211_FILS_DISCOVERY_ATTR_INT_MIN - Minimum packet interval (2) NL80211_FILS_DISCOVERY_ATTR_INT_MAX - Maximum packet interval (3) NL80211_FILS_DISCOVERY_ATTR_TMPL - Template data Signed-off-by: Aloka Dixit <alokad@codeaurora.org> Link: https://lore.kernel.org/r/20200805011838.28166-2-alokad@codeaurora.org [fix attribute and other names, use NLA_RANGE(), use policy only once] Link: https://lore.kernel.org/r/010101747a7b38a8-306f06b2-9061-4baf-81c1-054a42a18e22-000000@us-west-2.amazonses.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-09-18nl80211: support setting S1G channelsThomas Pedersen1-0/+10
S1G channels have a single width defined per frequency, so derive it from the channel flags with ieee80211_s1g_channel_width(). Also support setting an S1G channel where control frequency may differ from operating, and add some basic validation to ensure the control channel is with the operating. Signed-off-by: Thomas Pedersen <thomas@adapt-ip.com> Link: https://lore.kernel.org/r/20200908190323.15814-6-thomas@adapt-ip.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-09-18nl80211: advertise supported channel width in S1GThomas Pedersen1-0/+15
S1G supports 5 channel widths: 1, 2, 4, 8, and 16. One channel width is allowed per frequency in each operating class, so it makes more sense to advertise the specific channel width allowed. Signed-off-by: Thomas Pedersen <thomas@adapt-ip.com> Link: https://lore.kernel.org/r/20200908190323.15814-3-thomas@adapt-ip.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-09-18mac80211: extend ieee80211_tx_status_ext to support bulk freeFelix Fietkau1-0/+2
Store processed skbs ready to be freed in a list so the driver bulk free them Signed-off-by: Felix Fietkau <nbd@nbd.name> Link: https://lore.kernel.org/r/20200908123702.88454-13-nbd@nbd.name Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-09-18mac80211: notify the driver when a sta uses 4-address modeFelix Fietkau1-0/+4
This is needed for encapsulation offload of 4-address mode packets Signed-off-by: Felix Fietkau <nbd@nbd.name> Link: https://lore.kernel.org/r/20200908123702.88454-14-nbd@nbd.name Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-09-18mac80211: swap NEED_TXPROCESSING and HW_80211_ENCAP tx flagsFelix Fietkau1-7/+7
In order to unify the tx status path, the hw 802.11 encapsulation flag needs to survive the trip to the tx status call. Since we don't have any free bits in info->flags, we need to move one. IEEE80211_TX_INTFL_NEED_TXPROCESSING is only used internally in mac80211, and only before the call into the driver. Signed-off-by: Felix Fietkau <nbd@nbd.name> Link: https://lore.kernel.org/r/20200908123702.88454-10-nbd@nbd.name Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-09-18mac80211: remove tx status call to ieee80211_sta_register_airtimeFelix Fietkau1-2/+3
All drivers using airtime fairness are calling ieee80211_sta_register_airtime directly, now they must. Document this as well. Signed-off-by: Felix Fietkau <nbd@nbd.name> Link: https://lore.kernel.org/r/20200908123702.88454-8-nbd@nbd.name [johannes: update the documentation to suit] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-09-18mac80211: rework tx encapsulation offload APIFelix Fietkau1-0/+29
The current API (which lets the driver turn on/off per vif directly) has a number of limitations: - it does not deal with AP_VLAN - conditions for enabling (no tkip, no monitor) are only checked at add_interface time - no way to indicate 4-addr support In order to address this, store offload flags in struct ieee80211_vif (easy to extend for decap offload later). mac80211 initially sets the enable flag, but gives the driver a chance to modify it before its settings are applied. In addition to the .add_interface op, a .update_vif_offload op is introduced, which can be used for runtime changes. If a driver can't disable encap offload at runtime, or if it has some extra limitations, it can simply override the flags within those ops. Support for encap offload with 4-address mode interfaces can be enabled by setting a flag from .add_interface or .update_vif_offload. Signed-off-by: Felix Fietkau <nbd@nbd.name> Link: https://lore.kernel.org/r/20200908123702.88454-6-nbd@nbd.name [resolved conflict with commit aa2092a9bab3 ("ath11k: add raw mode and software crypto support")] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-09-18cfg80211: add more comments for ap_isolate in bss_parametersWright Feng1-0/+1
The value of struct bss_parameters::ap_isolate will be -1, 0 or 1. The value -1 means not to change. To prevent developers from thinking ap_isolate is only 0 or 1, I add more comments on it. Signed-off-by: Wright Feng <wright.feng@cypress.com> Reviewed-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20200908060157.98846-1-wright.feng@cypress.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-09-18genetlink: Remove unused function genl_err_attr()YueHaibing1-8/+0
It is never used, so can remove it. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-18net/sched: Remove unused function qdisc_queue_drop_head()YueHaibing1-6/+0
It is not used since commit a09ceb0e0814 ("sched: remove qdisc->drop") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-16nexthop: Convert to blocking notification chainIdo Schimmel1-1/+1
Currently, the only listener of the nexthop notification chain is the VXLAN driver. Subsequent patches will add more listeners (e.g., device drivers such as netdevsim) that need to be able to block when processing notifications. Therefore, convert the notification chain to a blocking one. This is safe as notifications are always emitted from process context. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-16nexthop: Remove NEXTHOP_EVENT_ADDIdo Schimmel1-1/+0
Not used anywhere. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Suggested-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-16nexthop: Remove unused function declaration from header fileIdo Schimmel1-3/+0
Not used or implemented anywhere. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-16devlink: introduce the health reporter test commandJiri Pirko1-0/+3
Introduce a test command for health reporters. User might use this command to trigger test event on a reporter if the reporter supports it. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-15bridge: Add SWITCHDEV_FDB_FLUSH_TO_BRIDGE notifierAlexandra Winter1-0/+1
so the switchdev can notifiy the bridge to flush non-permanent fdb entries for this port. This is useful whenever the hardware fdb of the switchdev is reset, but the netdev and the bridgeport are not deleted. Note that this has the same effect as the IFLA_BRPORT_FLUSH attribute. CC: Jiri Pirko <jiri@resnulli.us> CC: Ivan Vecera <ivecera@redhat.com> CC: Roopa Prabhu <roopa@nvidia.com> CC: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Alexandra Winter <wintera@linux.ibm.com> Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Acked-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-15net: sched: only keep the available bits when setting vxlan md->gbpXin Long1-0/+3
As we can see from vxlan_build/parse_gbp_hdr(), when processing metadata on vxlan rx/tx path, only dont_learn/policy_applied/policy_id fields can be set to or parse from the packet for vxlan gbp option. So we'd better do the mask when set it in act_tunnel_key and cls_flower. Otherwise, when users don't know these bits, they may configure with a value which can never be matched. Reported-by: Shuang Li <shuali@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-15ipv4: Initialize flowi4_multipath_hash in data pathDavid Ahern1-0/+1
flowi4_multipath_hash was added by the commit referenced below for tunnels. Unfortunately, the patch did not initialize the new field for several fast path lookups that do not initialize the entire flow struct to 0. Fix those locations. Currently, flowi4_multipath_hash is random garbage and affects the hash value computed by fib_multipath_hash for multipath selection. Fixes: 24ba14406c5c ("route: Add multipath_hash in flowi_common to make user-define hash") Signed-off-by: David Ahern <dsahern@gmail.com> Cc: wenxu <wenxu@ucloud.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-14tcp: remove SOCK_QUEUE_SHRUNKEric Dumazet1-2/+0
SOCK_QUEUE_SHRUNK is currently used by TCP as a temporary state that remembers if some room has been made in the rtx queue by an incoming ACK packet. This is later used from tcp_check_space() before considering to send EPOLLOUT. Problem is: If we receive SACK packets, and no packet is removed from RTX queue, we can send fresh packets, thus moving them from write queue to rtx queue and eventually empty the write queue. This stall can happen if TCP_NOTSENT_LOWAT is used. With this fix, we no longer risk stalling sends while holes are repaired, and we can fully use socket sndbuf. This also removes a cache line dirtying for typical RPC workloads. Fixes: c9bee3b7fdec ("tcp: TCP_NOTSENT_LOWAT socket option") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-14mptcp: call tcp_cleanup_rbuf on subflowsPaolo Abeni1-0/+2
That is needed to let the subflows announce promptly when new space is available in the receive buffer. tcp_cleanup_rbuf() is currently a static function, drop the scope modifier and add a declaration in the TCP header. Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-13Bluetooth: Emit controller suspend and resume eventsAbhishek Pandit-Subedi2-0/+7
Emit controller suspend and resume events when we are ready for suspend and we've resumed from suspend. The controller suspend event will report whatever suspend state was successfully entered. The controller resume event will check the first HCI event that was received after we finished preparing for suspend and, if it was a connection event, store the address of the peer that caused the event. If it was not a connection event, we mark the wake reason as an unexpected event. Here is a sample btmon trace with these events: @ MGMT Event: Controller Suspended (0x002d) plen 1 Suspend state: Page scanning and/or passive scanning (2) @ MGMT Event: Controller Resumed (0x002e) plen 8 Wake reason: Remote wake due to peer device connection (2) LE Address: CD:F3:CD:13:C5:9A (OUI CD-F3-CD) Signed-off-by: Abhishek Pandit-Subedi <abhishekpandit@chromium.org> Reviewed-by: Miao-chen Chou <mcchou@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2020-09-13Bluetooth: Add suspend reason for device disconnectAbhishek Pandit-Subedi1-0/+1
Update device disconnect event with reason 0x5 to indicate that device disconnected because the controller is suspending. Signed-off-by: Abhishek Pandit-Subedi <abhishekpandit@chromium.org> Reviewed-by: Miao-chen Chou <mcchou@chromium.org> Reviewed-by: Sonny Sasaka <sonnysasaka@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2020-09-13Bluetooth: Add mgmt suspend and resume eventsAbhishek Pandit-Subedi2-0/+14
Add the controller suspend and resume events, which will signal when Bluetooth has completed preparing for suspend and when it's ready for resume. Signed-off-by: Abhishek Pandit-Subedi <abhishekpandit@chromium.org> Reviewed-by: Miao-chen Chou <mcchou@chromium.org> Reviewed-by: Sonny Sasaka <sonnysasaka@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2020-09-11Bluetooth: Add MGMT capability flags for tx power and ext advertisingDaniel Winkler1-0/+2
For new advertising features, it will be important for userspace to know the capabilities of the controller and kernel. If the controller and kernel support extended advertising, we include flags indicating hardware offloading support and support for setting tx power of adv instances. In the future, vendor-specific commands may allow the setting of tx power in advertising instances, but for now this feature is only marked available if extended advertising is supported. This change is manually verified in userspace by ensuring the advertising manager's supported_flags field is updated with new flags on hatch chromebook (ext advertising supported). Signed-off-by: Daniel Winkler <danielwinkler@google.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2020-09-11tcp: simplify tcp_set_congestion_control(): Always reinitializeNeal Cardwell1-1/+1
Now that the previous patches ensure that all call sites for tcp_set_congestion_control() want to initialize congestion control, we can simplify tcp_set_congestion_control() by removing the reinit argument and the code to support it. Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yuchung Cheng <ycheng@google.com> Acked-by: Kevin Yang <yyd@google.com> Cc: Lawrence Brakmo <brakmo@fb.com>
2020-09-11tcp: Only init congestion control if not initialized alreadyNeal Cardwell1-1/+2
Change tcp_init_transfer() to only initialize congestion control if it has not been initialized already. With this new approach, we can arrange things so that if the EBPF code sets the congestion control by calling setsockopt(TCP_CONGESTION) then tcp_init_transfer() will not re-initialize the CC module. This is an approach that has the following beneficial properties: (1) This allows CC module customizations made by the EBPF called in tcp_init_transfer() to persist, and not be wiped out by a later call to tcp_init_congestion_control() in tcp_init_transfer(). (2) Does not flip the order of EBPF and CC init, to avoid causing bugs for existing code upstream that depends on the current order. (3) Does not cause 2 initializations for for CC in the case where the EBPF called in tcp_init_transfer() wants to set the CC to a new CC algorithm. (4) Allows follow-on simplifications to the code in net/core/filter.c and net/ipv4/tcp_cong.c, which currently both have some complexity to special-case CC initialization to avoid double CC initialization if EBPF sets the CC. Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yuchung Cheng <ycheng@google.com> Acked-by: Kevin Yang <yyd@google.com> Cc: Lawrence Brakmo <brakmo@fb.com>
2020-09-11netlink: fix doc about nlmsg_parse/nla_validateNicolas Dichtel1-2/+0
There is no @validate argument. CC: Johannes Berg <johannes.berg@intel.com> Fixes: 3de644035446 ("netlink: re-add parse/validate functions in strict mode") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-10tcp: reflect tos value received in SYN to the socketWei Wang1-0/+1
This commit adds a new TCP feature to reflect the tos value received in SYN, and send it out on the SYN-ACK, and eventually set the tos value of the established socket with this reflected tos value. This provides a way to set the traffic class/QoS level for all traffic in the same connection to be the same as the incoming SYN request. It could be useful in data centers to provide equivalent QoS according to the incoming request. This feature is guarded by /proc/sys/net/ipv4/tcp_reflect_tos, and is by default turned off. Signed-off-by: Wei Wang <weiwan@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-10ip: pass tos into ip_build_and_send_pkt()Wei Wang1-1/+1
This commit adds tos as a new passed in parameter to ip_build_and_send_pkt() which will be used in the later commit. This is a pure restructure and does not have any functional change. Signed-off-by: Wei Wang <weiwan@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-10devlink: Introduce controller numberParav Pandit1-2/+7
A devlink port may be for a controller consist of PCI device. A devlink instance holds ports of two types of controllers. (1) controller discovered on same system where eswitch resides This is the case where PCI PF/VF of a controller and devlink eswitch instance both are located on a single system. (2) controller located on external host system. This is the case where a controller is located in one system and its devlink eswitch ports are located in a different system. When a devlink eswitch instance serves the devlink ports of both controllers together, PCI PF/VF numbers may overlap. Due to this a unique phys_port_name cannot be constructed. For example in below such system controller-0 and controller-1, each has PCI PF pf0 whose eswitch ports can be present in controller-0. These results in phys_port_name as "pf0" for both. Similar problem exists for VFs and upcoming Sub functions. An example view of two controller systems: --------------------------------------------------------- | | | --------- --------- ------- ------- | ----------- | | vf(s) | | sf(s) | |vf(s)| |sf(s)| | | server | | ------- ----/---- ---/----- ------- ---/--- ---/--- | | pci rc |=== | pf0 |______/________/ | pf1 |___/_______/ | | connect | | ------- ------- | ----------- | | controller_num=1 (no eswitch) | ------|-------------------------------------------------- (internal wire) | --------------------------------------------------------- | devlink eswitch ports and reps | | ----------------------------------------------------- | | |ctrl-0 | ctrl-0 | ctrl-0 | ctrl-0 | ctrl-0 |ctrl-0 | | | |pf0 | pf0vfN | pf0sfN | pf1 | pf1vfN |pf1sfN | | | ----------------------------------------------------- | | |ctrl-1 | ctrl-1 | ctrl-1 | ctrl-1 | ctrl-1 |ctrl-1 | | | |pf1 | pf1vfN | pf1sfN | pf1 | pf1vfN |pf0sfN | | | ----------------------------------------------------- | | | | | | --------- --------- ------- ------- | | | vf(s) | | sf(s) | |vf(s)| |sf(s)| | | ------- ----/---- ---/----- ------- ---/--- ---/--- | | | pf0 |______/________/ | pf1 |___/_______/ | | ------- ------- | | | | local controller_num=0 (eswitch) | --------------------------------------------------------- An example devlink port for external controller with controller number = 1 for a VF 1 of PF 0: $ devlink port show pci/0000:06:00.0/2 pci/0000:06:00.0/2: type eth netdev ens2f0pf0vf1 flavour pcivf controller 1 pfnum 0 vfnum 1 external true splittable false function: hw_addr 00:00:00:00:00:00 $ devlink port show pci/0000:06:00.0/2 -jp { "port": { "pci/0000:06:00.0/2": { "type": "eth", "netdev": "ens2f0pf0vf1", "flavour": "pcivf", "controller": 1, "pfnum": 0, "vfnum": 1, "external": true, "splittable": false, "function": { "hw_addr": "00:00:00:00:00:00" } } } } Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-10devlink: Introduce external controller flagParav Pandit1-2/+6
A devlink eswitch port may represent PCI PF/VF ports of a controller. A controller either located on same system or it can be an external controller located in host where such NIC is plugged in. Add the ability for driver to specify if a port is for external controller. Use such flag in the mlx5_core driver. An example of an external controller having VF1 of PF0 belong to controller 1. $ devlink port show pci/0000:06:00.0/2 pci/0000:06:00.0/2: type eth netdev ens2f0pf0vf1 flavour pcivf pfnum 0 vfnum 1 external true splittable false function: hw_addr 00:00:00:00:00:00 $ devlink port show pci/0000:06:00.0/2 -jp { "port": { "pci/0000:06:00.0/2": { "type": "eth", "netdev": "ens2f0pf0vf1", "flavour": "pcivf", "pfnum": 0, "vfnum": 1, "external": true, "splittable": false, "function": { "hw_addr": "00:00:00:00:00:00" } } } } Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-10devlink: Move structure comments outside of structureParav Pandit1-3/+12
To add more fields to the PCI PF and VF port attributes, follow standard structure comment format. Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-10devlink: Add comment block for missing port attributesParav Pandit1-0/+3
Add comment block for physical, PF and VF port attributes. Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller2-3/+4
Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for net-next: 1) Rewrite inner header IPv6 in ICMPv6 messages in ip6t_NPT, from Michael Zhou. 2) do_ip_vs_set_ctl() dereferences uninitialized value, from Peilin Ye. 3) Support for userdata in tables, from Jose M. Guisado. 4) Do not increment ct error and invalid stats at the same time, from Florian Westphal. 5) Remove ct ignore stats, also from Florian. 6) Add ct stats for clash resolution, from Florian Westphal. 7) Bump reference counter bump on ct clash resolution only, this is safe because bucket lock is held, again from Florian. 8) Use ip_is_fragment() in xt_HMARK, from YueHaibing. 9) Add wildcard support for nft_socket, from Balazs Scheidler. 10) Remove superfluous IPVS dependency on iptables, from Yaroslav Bolyukin. 11) Remove unused definition in ebt_stp, from Wang Hai. 12) Replace CONFIG_NFT_CHAIN_NAT_{IPV4,IPV6} by CONFIG_NFT_NAT in selftests/net, from Fabian Frederick. 13) Add userdata support for nft_object, from Jose M. Guisado. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-08netfilter: nf_tables: add userdata support for nft_objectJose M. Guisado Gomez1-0/+2
Enables storing userdata for nft_object. Initially this will store an optional comment but can be extended in the future as needed. Adds new attribute NFTA_OBJ_USERDATA to nft_object. Signed-off-by: Jose M. Guisado Gomez <guigom@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>