summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)AuthorFilesLines
2022-12-03tcp: use 2-arg optimal variant of kfree_rcu()Eric Dumazet1-2/+2
kfree_rcu(1-arg) should be avoided as much as possible, since this is only possible from sleepable contexts, and incurr extra rcu barriers. I wish the 1-arg variant of kfree_rcu() would get a distinct name, like kfree_rcu_slow() to avoid it being abused. Fixes: 459837b522f7 ("net/tcp: Disable TCP-MD5 static key on tcp_md5sig_info destruction") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Dmitry Safonov <dima@arista.com> Link: https://lore.kernel.org/r/20221202052847.2623997-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-03Merge tag 'wireless-next-2022-12-02' of ↵Jakub Kicinski8-167/+230
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next Kalle Valo says: ==================== wireless-next patches for v6.2 Third set of patches for v6.2. mt76 has a new driver for mt7996 Wi-Fi 7 devices and iwlwifi also got initial Wi-Fi 7 support. Otherwise smaller features and fixes. Major changes: ath10k - store WLAN firmware version in SMEM image table mt76 - mt7996: new driver for MediaTek Wi-Fi 7 (802.11be) devices - mt7986, mt7915: enable Wireless Ethernet Dispatch (WED) offload support - mt7915: add ack signal support - mt7915: enable coredump support - mt7921: remain_on_channel support - mt7921: channel context support iwlwifi - enable Wi-Fi 7 Extremely High Throughput (EHT) PHY capabilities - 320 MHz channels support * tag 'wireless-next-2022-12-02' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (144 commits) wifi: ath10k: fix QCOM_SMEM dependency wifi: mt76: mt7921e: add pci .shutdown() support wifi: mt76: mt7915: mmio: fix naming convention wifi: mt76: mt7996: add support to configure spatial reuse parameter set wifi: mt76: mt7996: enable ack signal support wifi: mt76: mt7996: enable use_cts_prot support wifi: mt76: mt7915: rely on band_idx of mt76_phy wifi: mt76: mt7915: enable per bandwidth power limit support wifi: mt76: mt7915: introduce mt7915_get_power_bound() mt76: mt7915: Fix PCI device refcount leak in mt7915_pci_init_hif2() wifi: mt76: do not send firmware FW_FEATURE_NON_DL region wifi: mt76: mt7921: Add missing __packed annotation of struct mt7921_clc wifi: mt76: fix coverity overrun-call in mt76_get_txpower() wifi: mt76: mt7996: add driver for MediaTek Wi-Fi 7 (802.11be) devices wifi: mt76: mt76x0: remove dead code in mt76x0_phy_get_target_power wifi: mt76: mt7915: fix band_idx usage wifi: mt76: mt7915: enable .sta_set_txpwr support wifi: mt76: mt7915: add basedband Txpower info into debugfs wifi: mt76: mt7915: add support to configure spatial reuse parameter set wifi: mt76: mt7915: add missing MODULE_PARM_DESC ... ==================== Link: https://lore.kernel.org/r/20221202214254.D0D3DC433C1@smtp.kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-02net: devlink: convert port_list into xarrayJiri Pirko1-30/+27
Some devlink instances may contain thousands of ports. Storing them in linked list and looking them up is not scalable. Convert the linked list into xarray. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-02hsr: Use a single struct for self_node.Sebastian Andrzej Siewior3-37/+35
self_node_db is a list_head with one entry of struct hsr_node. The purpose is to hold the two MAC addresses of the node itself. It is convenient to recycle the structure. However having a list_head and fetching always the first entry is not really optimal. Created a new data strucure contaning the two MAC addresses named hsr_self_node. Access that structure like an RCU protected pointer so it can be replaced on the fly without blocking the reader. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-02hsr: Synchronize sequence number updates.Sebastian Andrzej Siewior2-1/+10
hsr_register_frame_out() compares new sequence_nr vs the old one recorded in hsr_node::seq_out and if the new sequence_nr is higher then it will be written to hsr_node::seq_out as the new value. This operation isn't locked so it is possible that two frames with the same sequence number arrive (via the two slave devices) and are fed to hsr_register_frame_out() at the same time. Both will pass the check and update the sequence counter later to the same value. As a result the content of the same packet is fed into the stack twice. This was noticed by running ping and observing DUP being reported from time to time. Instead of using the hsr_priv::seqnr_lock for the whole receive path (as it is for sending in the master node) add an additional lock that is only used for sequence number checks and updates. Add a per-node lock that is used during sequence number reads and updates. Fixes: f421436a591d3 ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-02hsr: Synchronize sending frames to have always incremented outgoing seq nr.Sebastian Andrzej Siewior2-7/+8
Sending frames via the hsr (master) device requires a sequence number which is tracked in hsr_priv::sequence_nr and protected by hsr_priv::seqnr_lock. Each time a new frame is sent, it will obtain a new id and then send it via the slave devices. Each time a packet is sent (via hsr_forward_do()) the sequence number is checked via hsr_register_frame_out() to ensure that a frame is not handled twice. This make sense for the receiving side to ensure that the frame is not injected into the stack twice after it has been received from both slave ports. There is no locking to cover the sending path which means the following scenario is possible: CPU0 CPU1 hsr_dev_xmit(skb1) hsr_dev_xmit(skb2) fill_frame_info() fill_frame_info() hsr_fill_frame_info() hsr_fill_frame_info() handle_std_frame() handle_std_frame() skb1's sequence_nr = 1 skb2's sequence_nr = 2 hsr_forward_do() hsr_forward_do() hsr_register_frame_out(, 2) // okay, send) hsr_register_frame_out(, 1) // stop, lower seq duplicate Both skbs (or their struct hsr_frame_info) received an unique id. However since skb2 was sent before skb1, the higher sequence number was recorded in hsr_register_frame_out() and the late arriving skb1 was dropped and never sent. This scenario has been observed in a three node HSR setup, with node1 + node2 having ping and iperf running in parallel. From time to time ping reported a missing packet. Based on tracing that missing ping packet did not leave the system. It might be possible (didn't check) to drop the sequence number check on the sending side. But if the higher sequence number leaves on wire before the lower does and the destination receives them in that order and it will drop the packet with the lower sequence number and never inject into the stack. Therefore it seems the only way is to lock the whole path from obtaining the sequence number and sending via dev_queue_xmit() and assuming the packets leave on wire in the same order (and don't get reordered by the NIC). Cover the whole path for the master interface from obtaining the ID until after it has been forwarded via hsr_forward_skb() to ensure the skbs are sent to the NIC in the order of the assigned sequence numbers. Fixes: f421436a591d3 ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-02hsr: Disable netpoll.Sebastian Andrzej Siewior2-11/+8
The hsr device is a software device. Its net_device_ops::ndo_start_xmit() routine will process the packet and then pass the resulting skb to dev_queue_xmit(). During processing, hsr acquires a lock with spin_lock_bh() (hsr_add_node()) which needs to be promoted to the _irq() suffix in order to avoid a potential deadlock. Then there are the warnings in dev_queue_xmit() (due to local_bh_disable() with disabled interrupts) left. Instead trying to address those (there is qdisc and…) for netpoll sake, just disable netpoll on hsr. Disable netpoll on hsr and replace the _irqsave() locking with _bh(). Fixes: f421436a591d3 ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-02hsr: Avoid double remove of a node.Sebastian Andrzej Siewior2-5/+12
Due to the hashed-MAC optimisation one problem become visible: hsr_handle_sup_frame() walks over the list of available nodes and merges two node entries into one if based on the information in the supervision both MAC addresses belong to one node. The list-walk happens on a RCU protected list and delete operation happens under a lock. If the supervision arrives on both slave interfaces at the same time then this delete operation can occur simultaneously on two CPUs. The result is the first-CPU deletes the from the list and the second CPUs BUGs while attempting to dereference a poisoned list-entry. This happens more likely with the optimisation because a new node for the mac_B entry is created once a packet has been received and removed (merged) once the supervision frame has been received. Avoid removing/ cleaning up a hsr_node twice by adding a `removed' field which is set to true after the removal and checked before the removal. Fixes: f266a683a4804 ("net/hsr: Better frame dispatch") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-02hsr: Add a rcu-read lock to hsr_forward_skb().Sebastian Andrzej Siewior1-0/+3
hsr_forward_skb() a skb and keeps information in an on-stack hsr_frame_info. hsr_get_node() assigns hsr_frame_info::node_src which is from a RCU list. This pointer is used later in hsr_forward_do(). I don't see a reason why this pointer can't vanish midway since there is no guarantee that hsr_forward_skb() is invoked from an RCU read section. Use rcu_read_lock() to protect hsr_frame_info::node_src from its assignment until it is no longer used. Fixes: f266a683a4804 ("net/hsr: Better frame dispatch") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-02Revert "net: hsr: use hlist_head instead of list_head for mac addresses"Sebastian Andrzej Siewior7-187/+106
The hlist optimisation (which not only uses hlist_head instead of list_head but also splits hsr_priv::node_db into an array of 256 slots) does not consider the "node merge": Upon starting the hsr network (with three nodes) a packet that is sent from node1 to node3 will also be sent from node1 to node2 and then forwarded to node3. As a result node3 will receive 2 packets because it is not able to filter out the duplicate. Each packet received will create a new struct hsr_node with macaddress_A only set the MAC address it received from (the two MAC addesses from node1). At some point (early in the process) two supervision frames will be received from node1. They will be processed by hsr_handle_sup_frame() and one frame will leave early ("Node has already been merged") and does nothing. The other frame will be merged as portB and have its MAC address written to macaddress_B and the hsr_node (that was created for it as macaddress_A) will be removed. From now on HSR is able to identify a duplicate because both packets sent from one node will result in the same struct hsr_node because hsr_get_node() will find the MAC address either on macaddress_A or macaddress_B. Things get tricky with the optimisation: If sender's MAC address is saved as macaddress_A then the lookup will work as usual. If the MAC address has been merged into macaddress_B of another hsr_node then the lookup won't work because it is likely that the data structure is in another bucket. This results in creating a new struct hsr_node and not recognising a possible duplicate. A way around it would be to add another hsr_node::mac_list_B and attach it to the other bucket to ensure that this hsr_node will be looked up either via macaddress_A _or_ macaddress_B. I however prefer to revert it because it sounds like an academic problem rather than real life workload plus it adds complexity. I'm not an HSR expert with what is usual size of a network but I would guess 40 to 60 nodes. With 10.000 nodes and assuming 60us for pass-through (from node to node) then it would take almost 600ms for a packet to almost wrap around which sounds a lot. Revert the hash MAC addresses optimisation. Fixes: 4acc45db71158 ("net: hsr: use hlist_head instead of list_head for mac addresses") Cc: Juhee Kang <claudiajkang@gmail.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-02sctp: delete free member from struct sctp_sched_opsXin Long3-51/+20
After commit 9ed7bfc79542 ("sctp: fix memory leak in sctp_stream_outq_migrate()"), sctp_sched_set_sched() is the only place calling sched->free(), and it can actually be replaced by sched->free_sid() on each stream, and yet there's already a loop to traverse all streams in sctp_sched_set_sched(). This patch adds a function sctp_sched_free_sched() where it calls sched->free_sid() for each stream to replace sched->free() calls in sctp_sched_set_sched() and then deletes the unused free member from struct sctp_sched_ops. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Link: https://lore.kernel.org/r/e10aac150aca2686cb0bd0570299ec716da5a5c0.1669849471.git.lucien.xin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-02mptcp: add pm listener eventsGeliang Tang3-0/+62
This patch adds two new MPTCP netlink event types for PM listening socket create and close, named MPTCP_EVENT_LISTENER_CREATED and MPTCP_EVENT_LISTENER_CLOSED. Add a new function mptcp_event_pm_listener() to push the new events with family, port and addr to userspace. Invoke mptcp_event_pm_listener() with MPTCP_EVENT_LISTENER_CREATED in mptcp_listen() and mptcp_pm_nl_create_listen_socket(), invoke it with MPTCP_EVENT_LISTENER_CLOSED in __mptcp_close_ssk(). Signed-off-by: Geliang Tang <geliang.tang@suse.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-02net/tcp: Separate initialization of twskDmitry Safonov1-26/+35
Convert BUG_ON() to WARN_ON_ONCE() and warn as well for unlikely static key int overflow error-path. Signed-off-by: Dmitry Safonov <dima@arista.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-02net/tcp: Do cleanup on tcp_md5_key_copy() failureDmitry Safonov2-14/+10
If the kernel was short on (atomic) memory and failed to allocate it - don't proceed to creation of request socket. Otherwise the socket would be unsigned and userspace likely doesn't expect that the TCP is not MD5-signed anymore. Signed-off-by: Dmitry Safonov <dima@arista.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-02net/tcp: Disable TCP-MD5 static key on tcp_md5sig_info destructionDmitry Safonov5-29/+77
To do that, separate two scenarios: - where it's the first MD5 key on the system, which means that enabling of the static key may need to sleep; - copying of an existing key from a listening socket to the request socket upon receiving a signed TCP segment, where static key was already enabled (when the key was added to the listening socket). Now the life-time of the static branch for TCP-MD5 is until: - last tcp_md5sig_info is destroyed - last socket in time-wait state with MD5 key is closed. Which means that after all sockets with TCP-MD5 keys are gone, the system gets back the performance of disabled md5-key static branch. While at here, provide static_key_fast_inc() helper that does ref counter increment in atomic fashion (without grabbing cpus_read_lock() on CONFIG_JUMP_LABEL=y). This is needed to add a new user for a static_key when the caller controls the lifetime of another user. Signed-off-by: Dmitry Safonov <dima@arista.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-02net/tcp: Separate tcp_md5sig_info allocation into tcp_md5sig_info_add()Dmitry Safonov1-9/+21
Add a helper to allocate tcp_md5sig_info, that will help later to do/allocate things when info allocated, once per socket. Signed-off-by: Dmitry Safonov <dima@arista.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-01wifi: mac80211: fix and simplify unencrypted drop check for meshFelix Fietkau1-28/+10
ieee80211_drop_unencrypted is called from ieee80211_rx_h_mesh_fwding and ieee80211_frame_allowed. Since ieee80211_rx_h_mesh_fwding can forward packets for other mesh nodes and is called earlier, it needs to check the decryptions status and if the packet is using the control protocol on its own, instead of deferring to the later call from ieee80211_frame_allowed. Because of that, ieee80211_drop_unencrypted has a mesh specific check that skips over the mesh header in order to check the payload protocol. This code is invalid when called from ieee80211_frame_allowed, since that happens after the 802.11->802.3 conversion. Fix this by moving the mesh specific check directly into ieee80211_rx_h_mesh_fwding. Signed-off-by: Felix Fietkau <nbd@nbd.name> Link: https://lore.kernel.org/r/20221201135730.19723-1-nbd@nbd.name Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-12-01wifi: mac80211: add support for restricting netdev features per vifFelix Fietkau2-97/+187
This can be used to selectively disable feature flags for checksum offload, scatter/gather or GSO by changing vif->netdev_features. Removing features from vif->netdev_features does not affect the netdev features themselves, but instead fixes up skbs in the tx path so that the offloads are not needed in the driver. Aside from making it easier to deal with vif type based hardware limitations, this also makes it possible to optimize performance on hardware without native GSO support by declaring GSO support in hw->netdev_features and removing it from vif->netdev_features. This allows mac80211 to handle GSO segmentation after the sta lookup, but before itxq enqueue, thus reducing the number of unnecessary sta lookups, as well as some other per-packet processing. Signed-off-by: Felix Fietkau <nbd@nbd.name> Link: https://lore.kernel.org/r/20221010094338.78070-1-nbd@nbd.name Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-12-01wifi: mac80211: update TIM for S1G specification changesKieran Frewen1-5/+9
Updates to the TIM information element to match changes made in the IEEE Std 802.11ah-2020. Signed-off-by: Kieran Frewen <kieran.frewen@morsemicro.com> Co-developed-by: Gilad Itzkovitch <gilad.itzkovitch@morsemicro.com> Signed-off-by: Gilad Itzkovitch <gilad.itzkovitch@morsemicro.com> Link: https://lore.kernel.org/r/20221106221602.25714-1-gilad.itzkovitch@morsemicro.com [use skb_put_data/skb_put_u8] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-12-01wifi: mac80211: don't parse multi-BSSID in assoc respJohannes Berg1-1/+1
It's not valid to have the multiple BSSID element in the association response (per 802.11 REVme D1.0), so don't try to parse it there, but only in the fallback beacon elements if needed. The other case that was parsing association requests was already changed in a previous commit. Change-Id: I659d2ef1253e079cc71c46a017044e116e31c024 Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-12-01wifi: cfg80211: use bss_from_pub() instead of container_of()Johannes Berg1-30/+11
There's no need to open-code container_of() when we have bss_from_pub(). Use it. Change-Id: I074723717909ba211a40e6499f0c36df0e2ba4be Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-12-01wifi: mac80211: remove unnecessary synchronize_net()Johannes Berg1-2/+1
The call to ieee80211_do_stop() right after will also do synchronize_rcu() to ensure the SDATA_STATE_RUNNING bit is cleared, so we don't need to synchronize_net() here. Change-Id: Id9f9ffcf195002013e5d9fde288877d219780864 Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-12-01wifi: mac80211: Drop not needed check for NULLAlexander Wetzel1-1/+1
ieee80211_get_txq() can only be called with vif != NULL. Remove not needed NULL test in function. Signed-off-by: Alexander Wetzel <alexander@wetzel-home.de> Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <error27@gmail.com> Link: https://lore.kernel.org/r/20221107161328.2883-1-alexander@wetzel-home.de Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-12-01wifi: cfg80211: Fix not unregister reg_pdev when load_builtin_regdb_keys() failsChen Zhongjin1-1/+3
In regulatory_init_db(), when it's going to return a error, reg_pdev should be unregistered. When load_builtin_regdb_keys() fails it doesn't do it and makes cfg80211 can't be reload with report: sysfs: cannot create duplicate filename '/devices/platform/regulatory.0' ... <TASK> dump_stack_lvl+0x79/0x9b sysfs_warn_dup.cold+0x1c/0x29 sysfs_create_dir_ns+0x22d/0x290 kobject_add_internal+0x247/0x800 kobject_add+0x135/0x1b0 device_add+0x389/0x1be0 platform_device_add+0x28f/0x790 platform_device_register_full+0x376/0x4b0 regulatory_init+0x9a/0x4b2 [cfg80211] cfg80211_init+0x84/0x113 [cfg80211] ... Fixes: 90a53e4432b1 ("cfg80211: implement regdb signature checking") Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com> Link: https://lore.kernel.org/r/20221109090237.214127-1-chenzhongjin@huawei.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-12-01wifi: cfg80211: fix comparison of BSS frequenciesJUN-KYU SHIN1-1/+2
If the "channel->freq_offset" comparison is omitted in cmp_bss(), BSS with different kHz units cannot be distinguished in the S1G Band. So "freq_offset" should also be included in the comparison. Signed-off-by: JUN-KYU SHIN <jk.shin@newratek.com> Link: https://lore.kernel.org/r/20221111023301.6395-1-jk.shin@newratek.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-12-01wifi: mac80211: fix maybe-unused warningÍñigo Huguet1-1/+1
In ieee80211_lookup_key, the variable named `local` is unused if compiled without lockdep, getting this warning: net/mac80211/cfg.c: In function ‘ieee80211_lookup_key’: net/mac80211/cfg.c:542:26: error: unused variable ‘local’ [-Werror=unused-variable] struct ieee80211_local *local = sdata->local; ^~~~~ Fix it with __maybe_unused. Fixes: 8cbf0c2ab6df ("wifi: mac80211: refactor some key code") Signed-off-by: Íñigo Huguet <ihuguet@redhat.com> Link: https://lore.kernel.org/r/20221111153622.29016-1-ihuguet@redhat.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-12-01wifi: mac80211: fix memory leak in ieee80211_if_add()Zhengchao Shao1-0/+1
When register_netdevice() failed in ieee80211_if_add(), ndev->tstats isn't released. Fix it. Fixes: 5a490510ba5f ("mac80211: use per-CPU TX/RX statistics") Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Link: https://lore.kernel.org/r/20221117064500.319983-1-shaozhengchao@huawei.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-12-01wifi: nl80211: Add checks for nla_nest_start() in nl80211_send_iface()Yuan Can1-0/+3
As the nla_nest_start() may fail with NULL returned, the return value needs to be checked. Fixes: ce08cd344a00 ("wifi: nl80211: expose link information for interfaces") Signed-off-by: Yuan Can <yuancan@huawei.com> Link: https://lore.kernel.org/r/20221129014211.56558-1-yuancan@huawei.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-12-01net: devlink: make the devlink_ops::info_get() callback optionalVincent Mailhol1-7/+6
Some drivers only reported the driver name in their devlink_ops::info_get() callback. Now that the core provides this information, the callback became empty. For such drivers, just removing the callback would prevent the core from executing devlink_nl_info_fill() meaning that "devlink dev info" would not return anything. Make the callback function optional by executing devlink_nl_info_fill() even if devlink_ops::info_get() is NULL. N.B.: the drivers with devlink support which previously did not implement devlink_ops::info_get() will now also be able to report the driver name. Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-01net: devlink: let the core report the driver name instead of the driversVincent Mailhol1-8/+18
The driver name is available in device_driver::name. Right now, drivers still have to report this piece of information themselves in their devlink_ops::info_get callback function. In order to factorize code, make devlink_nl_info_fill() add the driver name attribute. Now that the core sets the driver name attribute, drivers are not supposed to call devlink_info_driver_name_put() anymore. Remove devlink_info_driver_name_put() and clean-up all the drivers using this function in their callback. Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Tested-by: Ido Schimmel <idosch@nvidia.com> # mlxsw Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-01devlink: support directly reading from region memoryJacob Keller1-17/+63
To read from a region, user space must currently request a new snapshot of the region and then read from that snapshot. This can sometimes be overkill if user space only reads a tiny portion. They first create the snapshot, then request a read, then destroy the snapshot. For regions which have a single underlying "contents", it makes sense to allow supporting direct reading of the region data. Extend the DEVLINK_CMD_REGION_READ to allow direct reading from a region if requested via the new DEVLINK_ATTR_REGION_DIRECT. If this attribute is set, then perform a direct read instead of using a snapshot. Direct read is mutually exclusive with DEVLINK_ATTR_REGION_SNAPSHOT_ID, and care is taken to ensure that we reject commands which provide incorrect attributes. Regions must enable support for direct read by implementing the .read() callback function. If a region does not support such direct reads, a suitable extended error message is reported. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-01devlink: refactor region_read_snapshot_fill to use a callback functionJacob Keller1-9/+35
The devlink_nl_region_read_snapshot_fill is used to copy the contents of a snapshot into a message for reporting to userspace via the DEVLINK_CMG_REGION_READ netlink message. A future change is going to add support for directly reading from a region. Almost all of the logic for this new capability is identical. To help reduce code duplication and make this logic more generic, refactor the function to take a cb and cb_priv pointer for doing the actual copy. Add a devlink_region_snapshot_fill implementation that will simply copy the relevant chunk of the region. This does require allocating some storage for the chunk as opposed to simply passing the correct address forward to the devlink_nl_cmg_region_read_chunk_fill function. A future change to implement support for directly reading from a region without a snapshot will provide a separate implementation that calls the newly added devlink region operation. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-01devlink: remove unnecessary parameter from chunk_fill functionJacob Keller1-8/+2
The devlink parameter of the devlink_nl_cmd_region_read_chunk_fill function is not used. Remove it, to simplify the function signature. Once removed, it is also obvious that the devlink parameter is not necessary for the devlink_nl_region_read_snapshot_fill either. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-01devlink: find snapshot in devlink_nl_cmd_region_read_dumpitJacob Keller1-11/+14
The snapshot pointer is obtained inside of the function devlink_nl_region_read_snapshot_fill. Simplify this function by locating the snapshot upfront in devlink_nl_cmd_region_read_dumpit instead. This aligns with how other netlink attributes are handled, and allows us to exit slightly earlier if an invalid snapshot ID is provided. It also allows us to pass the snapshot pointer directly to the devlink_nl_region_read_snapshot_fill, and remove the now unused attrs parameter. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-01devlink: report extended error message in region_read_dumpit()Jacob Keller1-4/+12
Report extended error details in the devlink_nl_cmd_region_read_dumpit() function, by using the extack structure from the netlink_callback. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-01devlink: use min_t to calculate data_sizeJacob Keller1-4/+2
The calculation for the data_size in the devlink_nl_read_snapshot_fill function uses an if statement that is better expressed using the min_t macro. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-30Merge branch 'master' of ↵Jakub Kicinski7-46/+109
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== ipsec-next 2022-11-26 1) Remove redundant variable in esp6. From Colin Ian King. 2) Update x->lastused for every packet. It was used only for outgoing mobile IPv6 packets, but showed to be usefull to check if the a SA is still in use in general. From Antony Antony. 3) Remove unused variable in xfrm_byidx_resize. From Leon Romanovsky. 4) Finalize extack support for xfrm. From Sabrina Dubroca. * 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next: xfrm: add extack to xfrm_set_spdinfo xfrm: add extack to xfrm_alloc_userspi xfrm: add extack to xfrm_do_migrate xfrm: add extack to xfrm_new_ae and xfrm_replay_verify_len xfrm: add extack to xfrm_del_sa xfrm: add extack to xfrm_add_sa_expire xfrm: a few coding style clean ups xfrm: Remove not-used total variable xfrm: update x->lastused for every packet esp6: remove redundant variable err ==================== Link: https://lore.kernel.org/r/20221126110303.1859238-1-steffen.klassert@secunet.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-30mptcp: add support for TCP_FASTOPEN_KEY sockoptMatthieu Baerts1-3/+3
The goal of this socket option is to set different keys per listener, see commit 1fba70e5b6be ("tcp: socket option to set TCP fast open key") for more details about this socket option. The only thing to do here with MPTCP is to relay the request to the first subflow like it is already done for the other TCP_FASTOPEN* socket options. Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-30mptcp: add TCP_FASTOPEN sock optionDmytro Shytyi1-1/+4
The TCP_FASTOPEN socket option is one way for the application to tell the kernel TFO support has to be enabled for the listener socket. The only thing to do here with MPTCP is to relay the request to the first subflow like it is already done for the other TCP_FASTOPEN* socket options. Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Dmytro Shytyi <dmytro@shytyi.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-30mptcp: add subflow_v(4,6)_send_synack()Dmytro Shytyi4-1/+92
The send_synack() needs to be overridden for MPTCP to support TFO for two reasons: - There is not be enough space in the TCP options if the TFO cookie has to be added in the SYN+ACK with other options: MSS (4), SACK OK (2), Timestamps (10), Window Scale (3+1), TFO (10+2), MP_CAPABLE (12). MPTCPv1 specs -- RFC 8684, section B.1 [1] -- suggest to drop the TCP timestamps option in this case. - The data received in the SYN has to be handled: the SKB can be dequeued from the subflow sk and transferred to the MPTCP sk. Counters need to be updated accordingly and the application can be notified at the end because some bytes have been received. [1] https://www.rfc-editor.org/rfc/rfc8684.html#section-b.1 Co-developed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Dmytro Shytyi <dmytro@shytyi.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-30mptcp: implement delayed seq generation for passive fastopenDmytro Shytyi6-15/+54
With fastopen in place, the first subflow socket is created before the MPC handshake completes, and we need to properly initialize the sequence numbers at MPC ACK reception. Co-developed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Dmytro Shytyi <dmytro@shytyi.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-30mptcp: consolidate initial ack seq generationPaolo Abeni4-45/+45
Currently the initial ack sequence is generated on demand whenever it's requested and the remote key is handy. The relevant code is scattered in different places and can lead to multiple, unneeded, crypto operations. This change consolidates the ack sequence generation code in a single helper, storing the sequence number at the subflow level. The above additionally saves a few conditional in fast-path and will simplify the upcoming fast-open implementation. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-30mptcp: track accurately the incoming MPC suboption typePaolo Abeni1-3/+8
Currently in the receive path we don't need to discriminate between MPC SYN, MPC SYN-ACK and MPC ACK, but soon the fastopen code will need that info to properly track the fully established status. Track the exact MPC suboption type into the receive opt bitmap. No functional change intended. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-30mptcp: add MSG_FASTOPEN sendmsg flag supportDmytro Shytyi1-6/+3
Since commit 54f1944ed6d2 ("mptcp: factor out mptcp_connect()"), all the infrastructure is now in place to support the MSG_FASTOPEN flag, we just need to call into the fastopen path in mptcp_sendmsg(). Co-developed-by: Benjamin Hesmans <benjamin.hesmans@tessares.net> Signed-off-by: Benjamin Hesmans <benjamin.hesmans@tessares.net> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Dmytro Shytyi <dmytro@shytyi.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-30Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski55-263/+400
tools/lib/bpf/ringbuf.c 927cbb478adf ("libbpf: Handle size overflow for ringbuf mmap") b486d19a0ab0 ("libbpf: checkpatch: Fixed code alignments in ringbuf.c") https://lore.kernel.org/all/20221121122707.44d1446a@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-29Merge tag 'net-6.1-rc8-2' of ↵Linus Torvalds13-29/+83
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bpf, can and wifi. Current release - new code bugs: - eth: mlx5e: - use kvfree() in mlx5e_accel_fs_tcp_create() - MACsec, fix RX data path 16 RX security channel limit - MACsec, fix memory leak when MACsec device is deleted - MACsec, fix update Rx secure channel active field - MACsec, fix add Rx security association (SA) rule memory leak Previous releases - regressions: - wifi: cfg80211: don't allow multi-BSSID in S1G - stmmac: set MAC's flow control register to reflect current settings - eth: mlx5: - E-switch, fix duplicate lag creation - fix use-after-free when reverting termination table Previous releases - always broken: - ipv4: fix route deletion when nexthop info is not specified - bpf: fix a local storage BPF map bug where the value's spin lock field can get initialized incorrectly - tipc: re-fetch skb cb after tipc_msg_validate - wifi: wilc1000: fix Information Element parsing - packet: do not set TP_STATUS_CSUM_VALID on CHECKSUM_COMPLETE - sctp: fix memory leak in sctp_stream_outq_migrate() - can: can327: fix potential skb leak when netdev is down - can: add number of missing netdev freeing on error paths - aquantia: do not purge addresses when setting the number of rings - wwan: iosm: - fix incorrect skb length leading to truncated packet - fix crash in peek throughput test due to skb UAF" * tag 'net-6.1-rc8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (79 commits) net: ethernet: renesas: ravb: Fix promiscuous mode after system resumed MAINTAINERS: Update maintainer list for chelsio drivers ionic: update MAINTAINERS entry sctp: fix memory leak in sctp_stream_outq_migrate() packet: do not set TP_STATUS_CSUM_VALID on CHECKSUM_COMPLETE net/mlx5: Lag, Fix for loop when checking lag Revert "net/mlx5e: MACsec, remove replay window size limitation in offload path" net: marvell: prestera: Fix a NULL vs IS_ERR() check in some functions net: tun: Fix use-after-free in tun_detach() net: mdiobus: fix unbalanced node reference count net: hsr: Fix potential use-after-free tipc: re-fetch skb cb after tipc_msg_validate mptcp: fix sleep in atomic at close time mptcp: don't orphan ssk in mptcp_close() dsa: lan9303: Correct stat name ipv4: Fix route deletion when nexthop info is not specified net: wwan: iosm: fix incorrect skb length net: wwan: iosm: fix crash in peek throughput test net: wwan: iosm: fix dma_alloc_coherent incompatible pointer type net: wwan: iosm: fix kernel test robot reported error ...
2022-11-29udp_tunnel: Add checks for nla_nest_start() in __udp_tunnel_nic_dump_write()Yuan Can1-0/+2
As the nla_nest_start() may fail with NULL returned, the return value should be checked. Note that this is not a real bug, nothing will break here. The next nla_put() will fail as well and we'll bail (and nla_nest_cancel() can handle NULL). But we keep getting those "fixes" so whatever. Signed-off-by: Yuan Can <yuancan@huawei.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20221129013934.55184-1-yuancan@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-29sctp: fix memory leak in sctp_stream_outq_migrate()Zhengchao Shao4-7/+47
When sctp_stream_outq_migrate() is called to release stream out resources, the memory pointed to by prio_head in stream out is not released. The memory leak information is as follows: unreferenced object 0xffff88801fe79f80 (size 64): comm "sctp_repo", pid 7957, jiffies 4294951704 (age 36.480s) hex dump (first 32 bytes): 80 9f e7 1f 80 88 ff ff 80 9f e7 1f 80 88 ff ff ................ 90 9f e7 1f 80 88 ff ff 90 9f e7 1f 80 88 ff ff ................ backtrace: [<ffffffff81b215c6>] kmalloc_trace+0x26/0x60 [<ffffffff88ae517c>] sctp_sched_prio_set+0x4cc/0x770 [<ffffffff88ad64f2>] sctp_stream_init_ext+0xd2/0x1b0 [<ffffffff88aa2604>] sctp_sendmsg_to_asoc+0x1614/0x1a30 [<ffffffff88ab7ff1>] sctp_sendmsg+0xda1/0x1ef0 [<ffffffff87f765ed>] inet_sendmsg+0x9d/0xe0 [<ffffffff8754b5b3>] sock_sendmsg+0xd3/0x120 [<ffffffff8755446a>] __sys_sendto+0x23a/0x340 [<ffffffff87554651>] __x64_sys_sendto+0xe1/0x1b0 [<ffffffff89978b49>] do_syscall_64+0x39/0xb0 [<ffffffff89a0008b>] entry_SYSCALL_64_after_hwframe+0x63/0xcd Link: https://syzkaller.appspot.com/bug?exrid=29c402e56c4760763cc0 Fixes: 637784ade221 ("sctp: introduce priority based stream scheduler") Reported-by: syzbot+29c402e56c4760763cc0@syzkaller.appspotmail.com Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Reviewed-by: Xin Long <lucien.xin@gmail.com> Link: https://lore.kernel.org/r/20221126031720.378562-1-shaozhengchao@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-29packet: do not set TP_STATUS_CSUM_VALID on CHECKSUM_COMPLETEWillem de Bruijn1-4/+2
CHECKSUM_COMPLETE signals that skb->csum stores the sum over the entire packet. It does not imply that an embedded l4 checksum field has been validated. Fixes: 682f048bd494 ("af_packet: pass checksum validation status to the user") Signed-off-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20221128161812.640098-1-willemdebruijn.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-29net: devlink: add WARN_ON_ONCE to check return value of ↵Jiri Pirko1-2/+2
unregister_netdevice_notifier_net() call As the return value is not 0 only in case there is no such notifier block registered, add a WARN_ON_ONCE() to yell about it. Suggested-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/20221125100255.1786741-1-jiri@resnulli.us Signed-off-by: Paolo Abeni <pabeni@redhat.com>