summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2022-09-05net: phy: Add 1000BASE-KX interface modeSean Anderson1-0/+4
Add 1000BASE-KX interface mode. This 1G backplane ethernet as described in clause 70. Clause 73 autonegotiation is mandatory, and only full duplex operation is supported. Although at the PMA level this interface mode is identical to 1000BASE-X, it uses a different form of in-band autonegation. This justifies a separate interface mode, since the interface mode (along with the MLO_AN_* autonegotiation mode) sets the type of autonegotiation which will be used on a link. This results in more than just electrical differences between the link modes. With regard to 1000BASE-X, 1000BASE-KX holds a similar position to SGMII: same signaling, but different autonegotiation. PCS drivers (which typically handle in-band autonegotiation) may only support 1000BASE-X, and not 1000BASE-KX. Similarly, the phy mode is used to configure serdes phys with phy_set_mode_ext. Due to the different electrical standards (SFI or XFI vs Clause 70), they will likely want to use different configuration. Adding a phy interface mode for 1000BASE-KX helps simplify configuration in these areas. Signed-off-by: Sean Anderson <sean.anderson@seco.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-05soc: fsl: qbman: Add CGR update functionSean Anderson1-0/+9
This adds a function to update a CGR with new parameters. qman_create_cgr can almost be used for this (with flags=0), but it's not suitable because it also registers the callback function. The _safe variant was modeled off of qman_cgr_delete_safe. However, we handle multiple arguments and a return value. Signed-off-by: Sean Anderson <sean.anderson@seco.com> Acked-by: Camelia Groza <camelia.groza@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-05net: pcs: add new PCS driver for altera TSE PCSMaxime Chevallier1-0/+17
The Altera Triple Speed Ethernet has a SGMII/1000BaseC PCS that can be integrated in several ways. It can either be part of the TSE MAC's address space, accessed through 32 bits accesses on the mapped mdio device 0, or through a dedicated 16 bits register set. This driver allows using the TSE PCS outside of altera TSE's driver, since it can be used standalone by other MACs. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-04Merge tag 'wireless-next-2022-09-03' of ↵David S. Miller1-0/+11
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next Johannes Berg says: ==================== drivers - rtw89: large update across the map, e.g. coex, pci(e), etc. - ath9k: uninit memory read fix - ath10k: small peer map fix and a WCN3990 device fix - wfx: underflow stack - the "change MAC address while IFF_UP" change from James we discussed - more MLO work, including a set of fixes for the previous code, now that we have more code we can exercise it more - prevent some features with MLO that aren't ready yet (AP_VLAN and 4-address connections) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-03wifi: nl80211: add MLD address to assoc BSS entriesJohannes Berg1-0/+2
Add an MLD address attribute to BSS entries that the interface is currently associated with to help userspace figure out what's going on. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-09-03wifi: nl80211: Add POWERED_ADDR_CHANGE featureJames Prestwood1-0/+9
Add a new extended feature bit signifying that the wireless hardware supports changing the MAC address while the underlying net_device is powered. Note that this has a different meaning from IFF_LIVE_ADDR_CHANGE as additional restrictions might be imposed by the hardware, such as: - No connection is active on this interface, carrier is off - No scan is in progress - No offchannel operations are in progress Signed-off-by: James Prestwood <prestwoj@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-09-03net/ipv4: Use __DECLARE_FLEX_ARRAY() helperGustavo A. R. Silva1-14/+6
We now have a cleaner way to keep compatibility with user-space (a.k.a. not breaking it) when we need to keep in place a one-element array (for its use in user-space) together with a flexible-array member (for its use in kernel-space) without making it hard to read at the source level. This is through the use of the new __DECLARE_FLEX_ARRAY() helper macro. The size and memory layout of the structure is preserved after the changes. See below. Before changes: $ pahole -C ip_msfilter net/ipv4/igmp.o struct ip_msfilter { union { struct { __be32 imsf_multiaddr_aux; /* 0 4 */ __be32 imsf_interface_aux; /* 4 4 */ __u32 imsf_fmode_aux; /* 8 4 */ __u32 imsf_numsrc_aux; /* 12 4 */ __be32 imsf_slist[1]; /* 16 4 */ }; /* 0 20 */ struct { __be32 imsf_multiaddr; /* 0 4 */ __be32 imsf_interface; /* 4 4 */ __u32 imsf_fmode; /* 8 4 */ __u32 imsf_numsrc; /* 12 4 */ __be32 imsf_slist_flex[0]; /* 16 0 */ }; /* 0 16 */ }; /* 0 20 */ /* size: 20, cachelines: 1, members: 1 */ /* last cacheline: 20 bytes */ }; After changes: $ pahole -C ip_msfilter net/ipv4/igmp.o struct ip_msfilter { __be32 imsf_multiaddr; /* 0 4 */ __be32 imsf_interface; /* 4 4 */ __u32 imsf_fmode; /* 8 4 */ __u32 imsf_numsrc; /* 12 4 */ union { __be32 imsf_slist[1]; /* 16 4 */ struct { struct { } __empty_imsf_slist_flex; /* 16 0 */ __be32 imsf_slist_flex[0]; /* 16 0 */ }; /* 16 0 */ }; /* 16 4 */ /* size: 20, cachelines: 1, members: 5 */ /* last cacheline: 20 bytes */ }; In the past, we had to duplicate the whole original structure within a union, and update the names of all the members. Now, we just need to declare the flexible-array member to be used in kernel-space through the __DECLARE_FLEX_ARRAY() helper together with the one-element array, within a union. This makes the source code more clean and easier to read. Link: https://github.com/KSPP/linux/issues/193 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-03net: ieee802154: Fix compilation error when ↵Gal Pressman1-4/+2
CONFIG_IEEE802154_NL802154_EXPERIMENTAL is disabled When CONFIG_IEEE802154_NL802154_EXPERIMENTAL is disabled, NL802154_CMD_DEL_SEC_LEVEL is undefined and results in a compilation error: net/ieee802154/nl802154.c:2503:19: error: 'NL802154_CMD_DEL_SEC_LEVEL' undeclared here (not in a function); did you mean 'NL802154_CMD_SET_CCA_ED_LEVEL'? 2503 | .resv_start_op = NL802154_CMD_DEL_SEC_LEVEL + 1, | ^~~~~~~~~~~~~~~~~~~~~~~~~~ | NL802154_CMD_SET_CCA_ED_LEVEL Unhide the experimental commands, having them defined in an enum makes no difference. Fixes: 9c5d03d36251 ("genetlink: start to validate reserved header bytes") Signed-off-by: Gal Pressman <gal@nvidia.com> Acked-by: Stefan Schmidt <stefan@datenfreihafen.org> Tested-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Link: https://lore.kernel.org/r/20220902030620.2737091-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-02net: remove netif_tx_napi_add()Jakub Kicinski1-2/+0
All callers are now gone. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-02net: bql: add more documentationEric Dumazet1-6/+26
Add some documentation for netdev_tx_sent_queue() and netdev_tx_completed_queue() Stating that netdev_tx_completed_queue() must be called once per TX completion round is apparently not obvious for everybody. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski24-33/+106
tools/testing/selftests/net/.gitignore sort the net-next version and use it Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-01Merge tag 'net-6.0-rc4' of ↵Linus Torvalds4-9/+13
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from bluetooth, bpf and wireless. Current release - regressions: - bpf: - fix wrong last sg check in sk_msg_recvmsg() - fix kernel BUG in purge_effective_progs() - mac80211: - fix possible leak in ieee80211_tx_control_port() - potential NULL dereference in ieee80211_tx_control_port() Current release - new code bugs: - nfp: fix the access to management firmware hanging Previous releases - regressions: - ip: fix triggering of 'icmp redirect' - sched: tbf: don't call qdisc_put() while holding tree lock - bpf: fix corrupted packets for XDP_SHARED_UMEM - bluetooth: hci_sync: fix suspend performance regression - micrel: fix probe failure Previous releases - always broken: - tcp: make global challenge ack rate limitation per net-ns and default disabled - tg3: fix potential hang-up on system reboot - mac802154: fix reception for no-daddr packets Misc: - r8152: add PID for the lenovo onelink+ dock" * tag 'net-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (56 commits) net/smc: Remove redundant refcount increase Revert "sch_cake: Return __NET_XMIT_STOLEN when consuming enqueued skb" tcp: make global challenge ack rate limitation per net-ns and default disabled tcp: annotate data-race around challenge_timestamp net: dsa: hellcreek: Print warning only once ip: fix triggering of 'icmp redirect' sch_cake: Return __NET_XMIT_STOLEN when consuming enqueued skb selftests: net: sort .gitignore file Documentation: networking: correct possessive "its" kcm: fix strp_init() order and cleanup mlxbf_gige: compute MDIO period based on i1clk ethernet: rocker: fix sleep in atomic context bug in neigh_timer_handler net: lan966x: improve error handle in lan966x_fdma_rx_get_frame() nfp: fix the access to management firmware hanging net: phy: micrel: Make the GPIO to be non-exclusive net: virtio_net: fix notification coalescing comments net/sched: fix netdevice reference leaks in attach_default_qdiscs() net: sched: tbf: don't call qdisc_put() while holding tree lock net: Use u64_stats_fetch_begin_irq() for stats fetch. net: dsa: xrs700x: Use irqsave variant for u64 stats update ...
2022-09-01tcp: make global challenge ack rate limitation per net-ns and default disabledEric Dumazet1-0/+2
Because per host rate limiting has been proven problematic (side channel attacks can be based on it), per host rate limiting of challenge acks ideally should be per netns and turned off by default. This is a long due followup of following commits: 083ae308280d ("tcp: enable per-socket rate limiting of all 'challenge acks'") f2b2c582e824 ("tcp: mitigate ACK loops for connections as tcp_sock") 75ff39ccc1bd ("tcp: make challenge acks less predictable") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jason Baron <jbaron@akamai.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-01net: sched: gred/red: remove unused variables in struct red_statsZhengchao Shao1-1/+0
The variable "other" in the struct red_stats is not used. Remove it. Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-01mm/rmap: Fix anon_vma->degree ambiguity leading to double-reuseJann Horn1-2/+5
anon_vma->degree tracks the combined number of child anon_vmas and VMAs that use the anon_vma as their ->anon_vma. anon_vma_clone() then assumes that for any anon_vma attached to src->anon_vma_chain other than src->anon_vma, it is impossible for it to be a leaf node of the VMA tree, meaning that for such VMAs ->degree is elevated by 1 because of a child anon_vma, meaning that if ->degree equals 1 there are no VMAs that use the anon_vma as their ->anon_vma. This assumption is wrong because the ->degree optimization leads to leaf nodes being abandoned on anon_vma_clone() - an existing anon_vma is reused and no new parent-child relationship is created. So it is possible to reuse an anon_vma for one VMA while it is still tied to another VMA. This is an issue because is_mergeable_anon_vma() and its callers assume that if two VMAs have the same ->anon_vma, the list of anon_vmas attached to the VMAs is guaranteed to be the same. When this assumption is violated, vma_merge() can merge pages into a VMA that is not attached to the corresponding anon_vma, leading to dangling page->mapping pointers that will be dereferenced during rmap walks. Fix it by separately tracking the number of child anon_vmas and the number of VMAs using the anon_vma as their ->anon_vma. Fixes: 7a3ef208e662 ("mm: prevent endless growth of anon_vma hierarchy") Cc: stable@kernel.org Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-08-31Merge tag 'fscache-fixes-20220831' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull fscache/cachefiles fixes from David Howells: - Fix kdoc on fscache_use/unuse_cookie(). - Fix the error returned by cachefiles_ondemand_copen() from an upcall result. - Fix the distribution of requests in on-demand mode in cachefiles to be fairer by cycling through them rather than picking the one with the lowest ID each time (IDs being reused). * tag 'fscache-fixes-20220831' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: cachefiles: make on-demand request distribution fairer cachefiles: fix error return code in cachefiles_ondemand_copen() fscache: fix misdocumented parameter
2022-08-31Merge tag 'lsm-pr-20220829' of ↵Linus Torvalds3-0/+9
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm Pull LSM support for IORING_OP_URING_CMD from Paul Moore: "Add SELinux and Smack controls to the io_uring IORING_OP_URING_CMD. These are necessary as without them the IORING_OP_URING_CMD remains outside the purview of the LSMs (Luis' LSM patch, Casey's Smack patch, and my SELinux patch). They have been discussed at length with the io_uring folks, and Jens has given his thumbs-up on the relevant patches (see the commit descriptions). There is one patch that is not strictly necessary, but it makes testing much easier and is very trivial: the /dev/null IORING_OP_URING_CMD patch." * tag 'lsm-pr-20220829' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: Smack: Provide read control for io_uring_cmd /dev/null: add IORING_OP_URING_CMD support selinux: implement the security_uring_cmd() LSM hook lsm,io_uring: add LSM hooks for the new uring_cmd file op
2022-08-31fscache: fix misdocumented parameterKhalid Masum1-2/+2
This patch fixes two warnings generated by make docs. The functions fscache_use_cookie and fscache_unuse_cookie, both have a parameter named cookie. But they are documented with the name "object" with unclear description. Which generates the warning when creating docs. This commit will replace the currently misdocumented parameter names with the correct ones while adding proper descriptions. CC: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Khalid Masum <khalid.masum.92@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/20220521142446.4746-1-khalid.masum.92@gmail.com/ # v1 Link: https://lore.kernel.org/r/20220818040738.12036-1-khalid.masum.92@gmail.com/ # v2 Link: https://lore.kernel.org/r/880d7d25753fb326ee17ac08005952112fcf9bdb.1657360984.git.mchehab@kernel.org/ # Mauro's version
2022-08-31thunderbolt: Show link type for XDomain connections tooMika Westerberg1-0/+2
Following what we do for routers already, extend this to XDomain connections as well. This will show in sysfs whether the link is in USB4 or Thunderbolt mode. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-31net: virtio_net: fix notification coalescing commentsAlvaro Karsz1-7/+7
Fix wording in comments for the notifications coalescing feature. Signed-off-by: Alvaro Karsz <alvaro.karsz@solid-run.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://lore.kernel.org/r/20220823073947.14774-1-alvaro.karsz@solid-run.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-30net: devlink: stub port params cmds for they are unused internallyJiri Pirko1-1/+0
Follow-up the removal of unused internal api of port params made by commit 42ded61aa75e ("devlink: Delete not used port parameters APIs") and stub the commands and add extack message to tell the user what is going on. If later on port params are needed, could be easily re-introduced, but until then it is a dead code. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/20220826082730.1399735-1-jiri@resnulli.us Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-08-30netlink: add helpers for extack attr presence checkingJakub Kicinski2-0/+18
Being able to check attribute presence and set extack if not on one line is handy, add helpers. Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-08-30netlink: add support for ext_ack missing attributesJakub Kicinski2-0/+19
There is currently no way to report via extack in a structured way that an attribute is missing. This leads to families resorting to string messages. Add a pair of attributes - @offset and @type for machine-readable way of reporting missing attributes. The @offset points to the nest which should have contained the attribute, @type is the expected nla_type. The offset will be skipped if the attribute is missing at the message level rather than inside a nest. User space should be able to figure out which attribute enum (AKA attribute space AKA attribute set) the nest pointed to by @offset is using. Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-08-29tracing: Define the is_signed_type() macro onceBart Van Assche3-3/+6
There are two definitions of the is_signed_type() macro: one in <linux/overflow.h> and a second definition in <linux/trace_events.h>. As suggested by Linus, move the definition of the is_signed_type() macro into the <linux/compiler.h> header file. Change the definition of the is_signed_type() macro to make sure that it does not trigger any sparse warnings with future versions of sparse for bitwise types. Link: https://lore.kernel.org/all/CAHk-=whjH6p+qzwUdx5SOVVHjS3WvzJQr6mDUwhEyTf6pJWzaQ@mail.gmail.com/ Link: https://lore.kernel.org/all/CAHk-=wjQGnVfb4jehFR0XyZikdQvCZouE96xR_nnf5kqaM5qqQ@mail.gmail.com/ Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Steven Rostedt <rostedt@goodmis.org> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-08-29ethernet: Add helpers to recognize addresses mapped to IP multicastCasper Andersson1-0/+22
IP multicast must sometimes be discriminated from non-IP multicast, e.g. when determining the forwarding behavior of a given group in the presence of multicast router ports on an offloaded bridge. Therefore, provide helpers to identify these groups. Signed-off-by: Casper Andersson <casper.casan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-29genetlink: start to validate reserved header bytesJakub Kicinski2-0/+4
We had historically not checked that genlmsghdr.reserved is 0 on input which prevents us from using those precious bytes in the future. One use case would be to extend the cmd field, which is currently just 8 bits wide and 256 is not a lot of commands for some core families. To make sure that new families do the right thing by default put the onus of opting out of validation on existing families. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Paul Moore <paul@paul-moore.com> (NetLabel) Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-29Merge tag 'mm-hotfixes-stable-2022-08-28' of ↵Linus Torvalds3-7/+28
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull more hotfixes from Andrew Morton: "Seventeen hotfixes. Mostly memory management things. Ten patches are cc:stable, addressing pre-6.0 issues" * tag 'mm-hotfixes-stable-2022-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: .mailmap: update Luca Ceresoli's e-mail address mm/mprotect: only reference swap pfn page if type match squashfs: don't call kmalloc in decompressors mm/damon/dbgfs: avoid duplicate context directory creation mailmap: update email address for Colin King asm-generic: sections: refactor memory_intersects bootmem: remove the vmemmap pages from kmemleak in put_page_bootmem ocfs2: fix freeing uninitialized resource on ocfs2_dlm_shutdown Revert "memcg: cleanup racy sum avoidance code" mm/zsmalloc: do not attempt to free IS_ERR handle binder_alloc: add missing mmap_lock calls when using the VMA mm: re-allow pinning of zero pfns (again) vmcoreinfo: add kallsyms_num_syms symbol mailmap: update Guilherme G. Piccoli's email addresses writeback: avoid use-after-free after removing device shmem: update folio if shmem_replace_page() updates the page mm/hugetlb: avoid corrupting page->mapping in hugetlb_mcopy_atomic_pte
2022-08-29asm-generic: sections: refactor memory_intersectsQuanyang Wang1-2/+5
There are two problems with the current code of memory_intersects: First, it doesn't check whether the region (begin, end) falls inside the region (virt, vend), that is (virt < begin && vend > end). The second problem is if vend is equal to begin, it will return true but this is wrong since vend (virt + size) is not the last address of the memory region but (virt + size -1) is. The wrong determination will trigger the misreporting when the function check_for_illegal_area calls memory_intersects to check if the dma region intersects with stext region. The misreporting is as below (stext is at 0x80100000): WARNING: CPU: 0 PID: 77 at kernel/dma/debug.c:1073 check_for_illegal_area+0x130/0x168 DMA-API: chipidea-usb2 e0002000.usb: device driver maps memory from kernel text or rodata [addr=800f0000] [len=65536] Modules linked in: CPU: 1 PID: 77 Comm: usb-storage Not tainted 5.19.0-yocto-standard #5 Hardware name: Xilinx Zynq Platform unwind_backtrace from show_stack+0x18/0x1c show_stack from dump_stack_lvl+0x58/0x70 dump_stack_lvl from __warn+0xb0/0x198 __warn from warn_slowpath_fmt+0x80/0xb4 warn_slowpath_fmt from check_for_illegal_area+0x130/0x168 check_for_illegal_area from debug_dma_map_sg+0x94/0x368 debug_dma_map_sg from __dma_map_sg_attrs+0x114/0x128 __dma_map_sg_attrs from dma_map_sg_attrs+0x18/0x24 dma_map_sg_attrs from usb_hcd_map_urb_for_dma+0x250/0x3b4 usb_hcd_map_urb_for_dma from usb_hcd_submit_urb+0x194/0x214 usb_hcd_submit_urb from usb_sg_wait+0xa4/0x118 usb_sg_wait from usb_stor_bulk_transfer_sglist+0xa0/0xec usb_stor_bulk_transfer_sglist from usb_stor_bulk_srb+0x38/0x70 usb_stor_bulk_srb from usb_stor_Bulk_transport+0x150/0x360 usb_stor_Bulk_transport from usb_stor_invoke_transport+0x38/0x440 usb_stor_invoke_transport from usb_stor_control_thread+0x1e0/0x238 usb_stor_control_thread from kthread+0xf8/0x104 kthread from ret_from_fork+0x14/0x2c Refactor memory_intersects to fix the two problems above. Before the 1d7db834a027e ("dma-debug: use memory_intersects() directly"), memory_intersects is called only by printk_late_init: printk_late_init -> init_section_intersects ->memory_intersects. There were few places where memory_intersects was called. When commit 1d7db834a027e ("dma-debug: use memory_intersects() directly") was merged and CONFIG_DMA_API_DEBUG is enabled, the DMA subsystem uses it to check for an illegal area and the calltrace above is triggered. [akpm@linux-foundation.org: fix nearby comment typo] Link: https://lkml.kernel.org/r/20220819081145.948016-1-quanyang.wang@windriver.com Fixes: 979559362516 ("asm/sections: add helpers to check for section data") Signed-off-by: Quanyang Wang <quanyang.wang@windriver.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Thierry Reding <treding@nvidia.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-08-29Revert "memcg: cleanup racy sum avoidance code"Shakeel Butt1-2/+13
This reverts commit 96e51ccf1af33e82f429a0d6baebba29c6448d0f. Recently we started running the kernel with rstat infrastructure on production traffic and begin to see negative memcg stats values. Particularly the 'sock' stat is the one which we observed having negative value. $ grep "sock " /mnt/memory/job/memory.stat sock 253952 total_sock 18446744073708724224 Re-run after couple of seconds $ grep "sock " /mnt/memory/job/memory.stat sock 253952 total_sock 53248 For now we are only seeing this issue on large machines (256 CPUs) and only with 'sock' stat. I think the networking stack increase the stat on one cpu and decrease it on another cpu much more often. So, this negative sock is due to rstat flusher flushing the stats on the CPU that has seen the decrement of sock but missed the CPU that has increments. A typical race condition. For easy stable backport, revert is the most simple solution. For long term solution, I am thinking of two directions. First is just reduce the race window by optimizing the rstat flusher. Second is if the reader sees a negative stat value, force flush and restart the stat collection. Basically retry but limited. Link: https://lkml.kernel.org/r/20220817172139.3141101-1-shakeelb@google.com Fixes: 96e51ccf1af33e8 ("memcg: cleanup racy sum avoidance code") Signed-off-by: Shakeel Butt <shakeelb@google.com> Cc: "Michal Koutný" <mkoutny@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Muchun Song <songmuchun@bytedance.com> Cc: David Hildenbrand <david@redhat.com> Cc: Yosry Ahmed <yosryahmed@google.com> Cc: Greg Thelen <gthelen@google.com> Cc: <stable@vger.kernel.org> [5.15] Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-08-29mm: re-allow pinning of zero pfns (again)Alex Williamson1-3/+10
The below referenced commit makes the same error as 1c563432588d ("mm: fix is_pinnable_page against a cma page"), re-interpreting the logic to exclude pinning of the zero page, which breaks device assignment with vfio. To avoid further subtle mistakes, split the logic into discrete tests. [akpm@linux-foundation.org: simplify comment, per John] Link: https://lkml.kernel.org/r/166015037385.760108.16881097713975517242.stgit@omen Link: https://lore.kernel.org/all/165490039431.944052.12458624139225785964.stgit@omen Fixes: f25cbb7a95a2 ("mm: add zone device coherent type memory support") Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Suggested-by: Matthew Wilcox <willy@infradead.org> Suggested-by: Felix Kuehling <felix.kuehling@amd.com> Tested-by: Slawomir Laba <slawomirx.laba@intel.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Cc: Alex Sierra <alex.sierra@amd.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Alistair Popple <apopple@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-08-27openvswitch: allow specifying ifindex of new interfacesAndrey Zhadchenko1-0/+3
CRIU is preserving ifindexes of net devices after restoration. However, current Open vSwitch API does not allow to target ifindex, so we cannot correctly restore OVS configuration. Add new OVS_DP_ATTR_IFINDEX for OVS_DP_CMD_NEW and use it as desired ifindex. Use OVS_VPORT_ATTR_IFINDEX during OVS_VPORT_CMD_NEW to specify new netdev ifindex. Signed-off-by: Andrey Zhadchenko <andrey.zhadchenko@virtuozzo.com> Acked-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-26Merge tag 'io_uring-6.0-2022-08-26' of git://git.kernel.dk/linux-blockLinus Torvalds1-0/+8
Pull io_uring fixes from Jens Axboe: - Add missing header file to the MAINTAINERS entry for io_uring (Ammar) - liburing and the kernel ship the same io_uring.h header, but one change we've had for a long time only in liburing is to ensure it's C++ safe. Add extern C around it, so we can more easily sync them in the future (Ammar) - Fix an off-by-one in the sync cancel added in this merge window (me) - Error handling fix for passthrough (Kanchan) - Fix for address saving for async execution for the zc tx support (Pavel) - Fix ordering for TCP zc notifications, so we always have them ordered correctly between "data was sent" and "data was acked". This isn't strictly needed with the notification slots, but we've been pondering disabling the slot support for 6.0 - and if we do, then we do require the ordering to be sane. Regardless of that, it's the sane thing to do in terms of API (Pavel) - Minor cleanup for indentation and lockdep annotation (Pavel) * tag 'io_uring-6.0-2022-08-26' of git://git.kernel.dk/linux-block: io_uring/net: save address for sendzc async execution io_uring: conditional ->async_data allocation io_uring/notif: order notif vs send CQEs io_uring/net: fix indentation io_uring/net: fix zc send link failing io_uring/net: fix must_hold annotation io_uring: fix submission-failure handling for uring-cmd io_uring: fix off-by-one in sync cancelation file check io_uring: uapi: Add `extern "C"` in io_uring.h for liburing MAINTAINERS: Add `include/linux/io_uring_types.h`
2022-08-26Merge tag 'scsi-fixes' of ↵Linus Torvalds1-5/+1
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Ten fixes. Of the three core changes, the two large ones are a complete reversion of the async rework and an ALUA timing rework (the latter shouldn't affect non-ALUA paths). The remaining patches are all small and all but one in drivers" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: sd: Revert "Rework asynchronous resume support" scsi: core: Fix passthrough retry counter handling scsi: ufs: core: Reduce the power mode change timeout scsi: storvsc: Remove WQ_MEM_RECLAIM from storvsc_error_wq scsi: ufs: host: ufs-exynos: Make fsd_ufs_drvs static scsi: megaraid_sas: Remove unnecessary kfree() scsi: megaraid_sas: Fix double kfree() scsi: ufs: core: Enable link lost interrupt scsi: core: Allow the ALUA transitioning state enough time scsi: qla2xxx: Disable ATIO interrupt coalesce for quad port ISP27XX
2022-08-26wait_on_bit: add an acquire memory barrierMikulas Patocka7-5/+34
There are several places in the kernel where wait_on_bit is not followed by a memory barrier (for example, in drivers/md/dm-bufio.c:new_read). On architectures with weak memory ordering, it may happen that memory accesses that follow wait_on_bit are reordered before wait_on_bit and they may return invalid data. Fix this class of bugs by introducing a new function "test_bit_acquire" that works like test_bit, but has acquire memory ordering semantics. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Acked-by: Will Deacon <will@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-08-26lsm,io_uring: add LSM hooks for the new uring_cmd file opLuis Chamberlain3-0/+9
io-uring cmd support was added through ee692a21e9bf ("fs,io_uring: add infrastructure for uring-cmd"), this extended the struct file_operations to allow a new command which each subsystem can use to enable command passthrough. Add an LSM specific for the command passthrough which enables LSMs to inspect the command details. This was discussed long ago without no clear pointer for something conclusive, so this enables LSMs to at least reject this new file operation. [0] https://lkml.kernel.org/r/8adf55db-7bab-f59d-d612-ed906b948d19@schaufler-ca.com Cc: stable@vger.kernel.org Fixes: ee692a21e9bf ("fs,io_uring: add infrastructure for uring-cmd") Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Acked-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Paul Moore <paul@paul-moore.com>
2022-08-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller2-2/+4
Daniel borkmann says: ==================== The following pull-request contains BPF updates for your *net* tree. We've added 11 non-merge commits during the last 14 day(s) which contain a total of 13 files changed, 61 insertions(+), 24 deletions(-). The main changes are: 1) Fix BPF verifier's precision tracking around BPF ring buffer, from Kumar Kartikeya Dwivedi. 2) Fix regression in tunnel key infra when passing FLOWI_FLAG_ANYSRC, from Eyal Birger. 3) Fix insufficient permissions for bpf_sys_bpf() helper, from YiFei Zhu. 4) Fix splat from hitting BUG when purging effective cgroup programs, from Pu Lehui. 5) Fix range tracking for array poke descriptors, from Daniel Borkmann. 6) Fix corrupted packets for XDP_SHARED_UMEM in aligned mode, from Magnus Karlsson. 7) Fix NULL pointer splat in BPF sockmap sk_msg_recvmsg(), from Liu Jian. 8) Add READ_ONCE() to bpf_jit_limit when reading from sysctl, from Kuniyuki Iwashima. 9) Add BPF selftest lru_bug check to s390x deny list, from Daniel Müller. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-26net: sched: remove unnecessary init of qdisc skb headZhengchao Shao1-7/+0
The memory allocated by using kzallloc_node and kcalloc has been cleared. Therefore, the structure members of the new qdisc are 0. So there's no need to explicitly assign a value of 0. Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-26Merge tag 'wireless-next-2022-08-26-v2' of ↵David S. Miller4-20/+60
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next Johannes berg says: ==================== Various updates: * rtw88: operation, locking, warning, and code style fixes * rtw89: small updates * cfg80211/mac80211: more EHT/MLO (802.11be, WiFi 7) work * brcmfmac: a couple of fixes * misc cleanups etc. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski20-69/+104
drivers/net/ethernet/mellanox/mlx5/core/en_fs.c 21234e3a84c7 ("net/mlx5e: Fix use after free in mlx5e_fs_init()") c7eafc5ed068 ("net/mlx5e: Convert ethtool_steering member of flow_steering struct to pointer") https://lore.kernel.org/all/20220825104410.67d4709c@canb.auug.org.au/ https://lore.kernel.org/all/20220823055533.334471-1-saeed@kernel.org/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-26netdev: Use try_cmpxchg in napi_if_scheduled_mark_missedUros Bizjak1-2/+2
Use try_cmpxchg instead of cmpxchg (*ptr, old, new) == old in napi_if_scheduled_mark_missed. x86 CMPXCHG instruction returns success in ZF flag, so this change saves a compare after cmpxchg (and related move instruction in front of cmpxchg). Also, try_cmpxchg implicitly assigns old *ptr value to "old" when cmpxchg fails, enabling further code simplifications. Cc: Eric Dumazet <edumazet@google.com> Cc: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Link: https://lore.kernel.org/r/20220822143243.2798-1-ubizjak@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-26Merge tag 'net-6.0-rc3' of ↵Linus Torvalds9-11/+26
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from ipsec and netfilter (with one broken Fixes tag). Current release - new code bugs: - dsa: don't dereference NULL extack in dsa_slave_changeupper() - dpaa: fix <1G ethernet on LS1046ARDB - neigh: don't call kfree_skb() under spin_lock_irqsave() Previous releases - regressions: - r8152: fix the RX FIFO settings when suspending - dsa: microchip: keep compatibility with device tree blobs with no phy-mode - Revert "net: macsec: update SCI upon MAC address change." - Revert "xfrm: update SA curlft.use_time", comply with RFC 2367 Previous releases - always broken: - netfilter: conntrack: work around exceeded TCP receive window - ipsec: fix a null pointer dereference of dst->dev on a metadata dst in xfrm_lookup_with_ifid - moxa: get rid of asymmetry in DMA mapping/unmapping - dsa: microchip: make learning configurable and keep it off while standalone - ice: xsk: prohibit usage of non-balanced queue id - rxrpc: fix locking in rxrpc's sendmsg Misc: - another chunk of sysctl data race silencing" * tag 'net-6.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (87 commits) net: lantiq_xrx200: restore buffer if memory allocation failed net: lantiq_xrx200: fix lock under memory pressure net: lantiq_xrx200: confirm skb is allocated before using net: stmmac: work around sporadic tx issue on link-up ionic: VF initial random MAC address if no assigned mac ionic: fix up issues with handling EAGAIN on FW cmds ionic: clear broken state on generation change rxrpc: Fix locking in rxrpc's sendmsg net: ethernet: mtk_eth_soc: fix hw hash reporting for MTK_NETSYS_V2 MAINTAINERS: rectify file entry in BONDING DRIVER i40e: Fix incorrect address type for IPv6 flow rules ixgbe: stop resetting SYSTIME in ixgbe_ptp_start_cyclecounter net: Fix a data-race around sysctl_somaxconn. net: Fix a data-race around netdev_unregister_timeout_secs. net: Fix a data-race around gro_normal_batch. net: Fix data-races around sysctl_devconf_inherit_init_net. net: Fix data-races around sysctl_fb_tunnels_only_for_init_net. net: Fix a data-race around netdev_budget_usecs. net: Fix data-races around sysctl_max_skb_frags. net: Fix a data-race around netdev_budget. ...
2022-08-25net: devlink: limit flash component name to match version returned by info_get()Jiri Pirko1-2/+1
Limit the acceptance of component name passed to cmd_flash_update() to match one of the versions returned by info_get(), marked by version type. This makes things clearer and enforces 1:1 mapping between exposed version and accepted flash component. Check VERSION_TYPE_COMPONENT version type during cmd_flash_update() execution by calling info_get() with different "req" context. That causes info_get() to lookup the component name instead of filling-up the netlink message. Remove "UPDATE_COMPONENT" flag which becomes used. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-25net: devlink: extend info_get() version put to indicate a flash componentJiri Pirko1-0/+16
Whenever the driver is called by his info_get() op, it may put multiple version names and values to the netlink message. Extend by additional helper devlink_info_version_running/stored_put_ext() that allows to specify a version type that indicates when particular version name represents a flash component. This is going to be used in follow-up patch calling info_get() during flash update command checking if version with this the version type exists. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-25net: sched: delete duplicate cleanup of backlog and qlenZhengchao Shao1-1/+0
qdisc_reset() is clearing qdisc->q.qlen and qdisc->qstats.backlog _after_ calling qdisc->ops->reset. There is no need to clear them again in the specific reset function. Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Link: https://lore.kernel.org/r/20220824005231.345727-1-shaozhengchao@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-08-25wifi: cfg80211: Add link_id to cfg80211_ch_switch_started_notify()Veerendranath Jakkam1-1/+3
Add link_id parameter to cfg80211_ch_switch_started_notify() to allow driver to indicate on which link channel switch started on MLD. Send the data to userspace so it knows as well. Signed-off-by: Veerendranath Jakkam <quic_vjakkam@quicinc.com> Link: https://lore.kernel.org/r/20220722131143.3438042-1-quic_vjakkam@quicinc.com Link: https://lore.kernel.org/r/20220722131143.3438042-2-quic_vjakkam@quicinc.com [squash two patches] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-08-25wifi: mac80211: maintain link_id in link_staJohannes Berg1-0/+2
To helper drivers if they e.g. have a lookup of the link_sta pointer, add the link ID to the link_sta structure. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-08-25wifi: cfg80211/mac80211: check EHT capability size correctlyJohannes Berg1-4/+10
For AP/non-AP the EHT MCS/NSS subfield size differs, the 4-octet subfield is only used for 20 MHz-only non-AP STA. Pass an argument around everywhere to be able to parse it properly. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-08-25wifi: mac80211: add link information in ieee80211_rx_statusVasanthakumar Thiagarajan1-0/+5
In MLO, when the address translation from link to MLD is done in fw/hw, it is necessary to be able to have some information on the link on which the frame has been received. Extend the rx API to include link_id and a valid flag in ieee80211_rx_status. Also make chanes to mac80211 rx APIs to make use of the reported link_id after sanity checks. Signed-off-by: Vasanthakumar Thiagarajan <quic_vthiagar@quicinc.com> Link: https://lore.kernel.org/r/20220817104213.2531-2-quic_vthiagar@quicinc.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-08-25wifi: mac80211: properly implement MLO key handlingJohannes Berg1-0/+2
Implement key installation and lookup (on TX and RX) for MLO, so we can use multiple GTKs/IGTKs/BIGTKs. Co-authored-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-08-25wifi: cfg80211: Add link_id parameter to various key operations for MLOVeerendranath Jakkam2-15/+36
Add support for various key operations on MLD by adding new parameter link_id. Pass the link_id received from userspace to driver for add_key, get_key, del_key, set_default_key, set_default_mgmt_key and set_default_beacon_key to support configuring keys specific to each MLO link. Userspace must not specify link ID for MLO pairwise key since it is common for all the MLO links. Signed-off-by: Veerendranath Jakkam <quic_vjakkam@quicinc.com> Link: https://lore.kernel.org/r/20220730052643.1959111-4-quic_vjakkam@quicinc.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>