summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2022-02-17net/mlx5e: TC, Move flow hashtable to be per repPaul Blakey5-32/+43
To allow shared tc block offload between two or more reps of the same eswitch, move the tc flow hashtable to be per rep, instead of per eswitch. Signed-off-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-17net/mlx5e: E-Switch, Add support for tx_port_ts in switchdev modeAya Levin1-3/+15
When turning on tx_port_ts (private flag) a PTP-SQ is created. Consider this queue when adding rules matching SQs to VPORTs. Otherwise the traffic on this queue won't reach the wire. Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-17net/mlx5e: E-Switch, Add PTP counters for uplink representorAya Levin3-1/+3
There is a configuration where the uplink interface is the synchronizer. Add PTP counters for this interface for monitoring. Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-17net/mlx5e: RX, Restrict bulk size for small Striding RQsTariq Toukan5-2/+10
In RQs of type multi-packet WQE (Striding RQ), each WQE is relatively large (typically 256KB) but their number is relatively small (8 in default). Re-mapping the descriptors' buffers before re-posting them is done via UMR (User-Mode Memory Registration) operations. On the one hand, posting UMR WQEs in bulks reduces communication overhead with the HW and better utilizes its processing units. On the other hand, delaying the WQE repost operations for a small RQ (say, of 4 WQEs) might drastically hit its performance, causing packet drops due to no receive buffer, for high or bursty incoming packets rate. Here we restrict the bulk size for too small RQs. Effectively, with the current constants, RQ of size 4 (minimum allowed) would have no bulking, while larger RQs will continue working with bulks of 2. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-17net/mlx5e: Default to Striding RQ when not conflicting with CQE compressionTariq Toukan1-2/+3
CQE compression is turned on by default on slow pci systems to help reduce the load on pci. In this case, Striding RQ was turned off as CQEs of packets that span several strides were not compressed, significantly reducing the compression effectiveness. This issue does not exist when using the newer mini_cqe format "stride_index". Hence, allow defaulting to Striding RQ in this case. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-17net/mlx5e: Generalize packet merge error messageTariq Toukan1-2/+2
Update the old error message for LRO state modify with the new general name. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Khalid Manaa <khalidm@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-17net/mlx5e: Add support for using xdp->data_metaAlex Liu1-5/+12
Add support for using xdp->data_meta for cross-program communication Pass "true" to the last argument of xdp_prepare_buff(). After SKB is built, call skb_metadata_set() if metadata was pushed. Signed-off-by: Alex Liu <liualex@fb.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-17net/mlx5e: Fix spelling mistake "supoported" -> "supported"Colin Ian King1-1/+1
There is a spelling mistake in a NL_SET_ERR_MSG_MOD error message. Fix it. Fixes: 3b49a7edec1d ("net/mlx5e: TC, Reject rules with multiple CT actions") Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-17Merge tag 'mediatek-drm-fixes-5.17' of ↵Dave Airlie1-83/+84
https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux into drm-fixes Mediatek DRM Fixes for Linux 5.17 1. Avoid EPROBE_DEFER loop with external bridge Signed-off-by: Dave Airlie <airlied@redhat.com> From: Chun-Kuang Hu <chunkuang.hu@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/1645027727-19554-1-git-send-email-chunkuang.hu@kernel.org
2022-02-17net: rtnetlink: rtnl_stats_get(): Emit an extack for unset filter_maskPetr Machata1-1/+3
Both get and dump handlers for RTM_GETSTATS require that a filter_mask, a mask of which attributes should be emitted in the netlink response, is unset. rtnl_stats_dump() does include an extack in the bounce, rtnl_stats_get() however does not. Fix the omission. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/01feb1f4bbd22a19f6629503c4f366aed6424567.1645020876.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17Merge branch 'mptcp-so_sndtimeo-and-misc-cleanup'Jakub Kicinski7-76/+64
Mat Martineau says: ==================== mptcp: SO_SNDTIMEO and misc. cleanup Patch 1 adds support for the SO_SNDTIMEO socket option on MPTCP sockets. The remaining patches are various small cleanups: Patch 2 removes an obsolete declaration. Patches 3 and 5 remove unnecessary function parameters. Patch 4 removes an extra cast. Patches 6 and 7 add some const and ro_after_init modifiers. Patch 8 removes extra storage of TCP helpers. ==================== Link: https://lore.kernel.org/r/20220216021130.171786-1-mathew.j.martineau@linux.intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17mptcp: don't save tcp data_ready and write space callbacksFlorian Westphal2-8/+6
Assign the helpers directly rather than save/restore in the context structure. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17mptcp: mark ops structures as ro_after_initFlorian Westphal1-8/+7
These structures are initialised from the init hooks, so we can't make them 'const'. But no writes occur afterwards, so we can use ro_after_init. Also, remove bogus EXPORT_SYMBOL, the only access comes from ip stack, not from kernel modules. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17mptcp: constify a bunch of of helpersPaolo Abeni3-32/+32
A few pm-related helpers don't touch arguments which lacking the const modifier, let's constify them. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17mptcp: drop port parameter of mptcp_pm_add_addr_signalGeliang Tang3-7/+7
Drop the port parameter of mptcp_pm_add_addr_signal() and reflect it to avoid passing too many parameters. Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17mptcp: drop unneeded type casts for hmacGeliang Tang2-5/+2
Drop the unneeded type casts to 'unsigned long long' for printing out the hmac values in add_addr_hmac_valid() and subflow_thmac_valid(). Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17mptcp: drop unused sk in mptcp_get_optionsGeliang Tang3-10/+8
The parameter 'sk' became useless since the code using it was dropped from mptcp_get_options() in the commit 8d548ea1dd15 ("mptcp: do not set unconditionally csum_reqd on incoming opt"). Let's drop it. Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17mptcp: mptcp_parse_option is no longer exportedMatthieu Baerts1-6/+0
Options parsing in now done from mptcp_incoming_options(). mptcp_parse_option() has been removed from mptcp.h when CONFIG_MPTCP is defined but not when it is not. Fixes: cfde141ea3fa ("mptcp: move option parsing into mptcp_incoming_options()") Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17mptcp: add SNDTIMEO setsockopt supportGeliang Tang1-0/+2
Add setsockopt support for SO_SNDTIMEO_OLD and SO_SNDTIMEO_NEW to fix this error reported by the mptcp bpf selftest: (network_helpers.c:64: errno: Operation not supported) Failed to set SO_SNDTIMEO test_mptcp:FAIL:115 All error logs: (network_helpers.c:64: errno: Operation not supported) Failed to set SO_SNDTIMEO test_mptcp:FAIL:115 Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17net: Fix an ignored error return from dm9051_get_regs()Yang Li1-1/+1
The return from the call to dm9051_get_regs() is int, it can be a negative error code, however this is being assigned to an unsigned int variable 'ret', so making 'ret' an int. Eliminate the following coccicheck warning: ./drivers/net/ethernet/davicom/dm9051.c:527:5-8: WARNING: Unsigned expression compared with zero: ret < 0 Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Link: https://lore.kernel.org/r/20220216014507.109117-1-yang.lee@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17net: sched: limit TC_ACT_REPEAT loopsEric Dumazet1-3/+10
We have been living dangerously, at the mercy of malicious users, abusing TC_ACT_REPEAT, as shown by this syzpot report [1]. Add an arbitrary limit (32) to the number of times an action can return TC_ACT_REPEAT. v2: switch the limit to 32 instead of 10. Use net_warn_ratelimited() instead of pr_err_once(). [1] (C repro available on demand) rcu: INFO: rcu_preempt self-detected stall on CPU rcu: 1-...!: (10500 ticks this GP) idle=021/1/0x4000000000000000 softirq=5592/5592 fqs=0 (t=10502 jiffies g=5305 q=190) rcu: rcu_preempt kthread timer wakeup didn't happen for 10502 jiffies! g5305 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 rcu: Possible timer handling issue on cpu=0 timer-softirq=3527 rcu: rcu_preempt kthread starved for 10505 jiffies! g5305 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=0 rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior. rcu: RCU grace-period kthread stack dump: task:rcu_preempt state:I stack:29344 pid: 14 ppid: 2 flags:0x00004000 Call Trace: <TASK> context_switch kernel/sched/core.c:4986 [inline] __schedule+0xab2/0x4db0 kernel/sched/core.c:6295 schedule+0xd2/0x260 kernel/sched/core.c:6368 schedule_timeout+0x14a/0x2a0 kernel/time/timer.c:1881 rcu_gp_fqs_loop+0x186/0x810 kernel/rcu/tree.c:1963 rcu_gp_kthread+0x1de/0x320 kernel/rcu/tree.c:2136 kthread+0x2e9/0x3a0 kernel/kthread.c:377 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295 </TASK> rcu: Stack dump where RCU GP kthread last ran: Sending NMI from CPU 1 to CPUs 0: NMI backtrace for cpu 0 CPU: 0 PID: 3646 Comm: syz-executor358 Not tainted 5.17.0-rc3-syzkaller-00149-gbf8e59fd315f #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:rep_nop arch/x86/include/asm/vdso/processor.h:13 [inline] RIP: 0010:cpu_relax arch/x86/include/asm/vdso/processor.h:18 [inline] RIP: 0010:pv_wait_head_or_lock kernel/locking/qspinlock_paravirt.h:437 [inline] RIP: 0010:__pv_queued_spin_lock_slowpath+0x3b8/0xb40 kernel/locking/qspinlock.c:508 Code: 48 89 eb c6 45 01 01 41 bc 00 80 00 00 48 c1 e9 03 83 e3 07 41 be 01 00 00 00 48 b8 00 00 00 00 00 fc ff df 4c 8d 2c 01 eb 0c <f3> 90 41 83 ec 01 0f 84 72 04 00 00 41 0f b6 45 00 38 d8 7f 08 84 RSP: 0018:ffffc9000283f1b0 EFLAGS: 00000206 RAX: 0000000000000003 RBX: 0000000000000000 RCX: 1ffff1100fc0071e RDX: 0000000000000001 RSI: 0000000000000201 RDI: 0000000000000000 RBP: ffff88807e0038f0 R08: 0000000000000001 R09: ffffffff8ffbf9ff R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000004c1e R13: ffffed100fc0071e R14: 0000000000000001 R15: ffff8880b9c3aa80 FS: 00005555562bf300(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ffdbfef12b8 CR3: 00000000723c2000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> pv_queued_spin_lock_slowpath arch/x86/include/asm/paravirt.h:591 [inline] queued_spin_lock_slowpath arch/x86/include/asm/qspinlock.h:51 [inline] queued_spin_lock include/asm-generic/qspinlock.h:85 [inline] do_raw_spin_lock+0x200/0x2b0 kernel/locking/spinlock_debug.c:115 spin_lock_bh include/linux/spinlock.h:354 [inline] sch_tree_lock include/net/sch_generic.h:610 [inline] sch_tree_lock include/net/sch_generic.h:605 [inline] prio_tune+0x3b9/0xb50 net/sched/sch_prio.c:211 prio_init+0x5c/0x80 net/sched/sch_prio.c:244 qdisc_create.constprop.0+0x44a/0x10f0 net/sched/sch_api.c:1253 tc_modify_qdisc+0x4c5/0x1980 net/sched/sch_api.c:1660 rtnetlink_rcv_msg+0x413/0xb80 net/core/rtnetlink.c:5594 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2494 netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline] netlink_unicast+0x539/0x7e0 net/netlink/af_netlink.c:1343 netlink_sendmsg+0x904/0xe00 net/netlink/af_netlink.c:1919 sock_sendmsg_nosec net/socket.c:705 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:725 ____sys_sendmsg+0x6e8/0x810 net/socket.c:2413 ___sys_sendmsg+0xf3/0x170 net/socket.c:2467 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2496 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f7ee98aae99 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 41 15 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007ffdbfef12d8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00007ffdbfef1300 RCX: 00007f7ee98aae99 RDX: 0000000000000000 RSI: 0000000020000000 RDI: 0000000000000003 RBP: 0000000000000000 R08: 000000000000000d R09: 000000000000000d R10: 000000000000000d R11: 0000000000000246 R12: 00007ffdbfef12f0 R13: 00000000000f4240 R14: 000000000004ca47 R15: 00007ffdbfef12e4 </TASK> INFO: NMI handler (nmi_cpu_backtrace_handler) took too long to run: 2.293 msecs NMI backtrace for cpu 1 CPU: 1 PID: 3260 Comm: kworker/1:3 Not tainted 5.17.0-rc3-syzkaller-00149-gbf8e59fd315f #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: mld mld_ifc_work Call Trace: <IRQ> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106 nmi_cpu_backtrace.cold+0x47/0x144 lib/nmi_backtrace.c:111 nmi_trigger_cpumask_backtrace+0x1b3/0x230 lib/nmi_backtrace.c:62 trigger_single_cpu_backtrace include/linux/nmi.h:164 [inline] rcu_dump_cpu_stacks+0x25e/0x3f0 kernel/rcu/tree_stall.h:343 print_cpu_stall kernel/rcu/tree_stall.h:604 [inline] check_cpu_stall kernel/rcu/tree_stall.h:688 [inline] rcu_pending kernel/rcu/tree.c:3919 [inline] rcu_sched_clock_irq.cold+0x5c/0x759 kernel/rcu/tree.c:2617 update_process_times+0x16d/0x200 kernel/time/timer.c:1785 tick_sched_handle+0x9b/0x180 kernel/time/tick-sched.c:226 tick_sched_timer+0x1b0/0x2d0 kernel/time/tick-sched.c:1428 __run_hrtimer kernel/time/hrtimer.c:1685 [inline] __hrtimer_run_queues+0x1c0/0xe50 kernel/time/hrtimer.c:1749 hrtimer_interrupt+0x31c/0x790 kernel/time/hrtimer.c:1811 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1086 [inline] __sysvec_apic_timer_interrupt+0x146/0x530 arch/x86/kernel/apic/apic.c:1103 sysvec_apic_timer_interrupt+0x8e/0xc0 arch/x86/kernel/apic/apic.c:1097 </IRQ> <TASK> asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:638 RIP: 0010:__sanitizer_cov_trace_const_cmp4+0xc/0x70 kernel/kcov.c:286 Code: 00 00 00 48 89 7c 30 e8 48 89 4c 30 f0 4c 89 54 d8 20 48 89 10 5b c3 0f 1f 80 00 00 00 00 41 89 f8 bf 03 00 00 00 4c 8b 14 24 <89> f1 65 48 8b 34 25 00 70 02 00 e8 14 f9 ff ff 84 c0 74 4b 48 8b RSP: 0018:ffffc90002c5eea8 EFLAGS: 00000246 RAX: 0000000000000007 RBX: ffff88801c625800 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000003 RBP: ffff8880137d3100 R08: 0000000000000000 R09: 0000000000000000 R10: ffffffff874fcd88 R11: 0000000000000000 R12: ffff88801d692dc0 R13: ffff8880137d3104 R14: 0000000000000000 R15: ffff88801d692de8 tcf_police_act+0x358/0x11d0 net/sched/act_police.c:256 tcf_action_exec net/sched/act_api.c:1049 [inline] tcf_action_exec+0x1a6/0x530 net/sched/act_api.c:1026 tcf_exts_exec include/net/pkt_cls.h:326 [inline] route4_classify+0xef0/0x1400 net/sched/cls_route.c:179 __tcf_classify net/sched/cls_api.c:1549 [inline] tcf_classify+0x3e8/0x9d0 net/sched/cls_api.c:1615 prio_classify net/sched/sch_prio.c:42 [inline] prio_enqueue+0x3a7/0x790 net/sched/sch_prio.c:75 dev_qdisc_enqueue+0x40/0x300 net/core/dev.c:3668 __dev_xmit_skb net/core/dev.c:3756 [inline] __dev_queue_xmit+0x1f61/0x3660 net/core/dev.c:4081 neigh_hh_output include/net/neighbour.h:533 [inline] neigh_output include/net/neighbour.h:547 [inline] ip_finish_output2+0x14dc/0x2170 net/ipv4/ip_output.c:228 __ip_finish_output net/ipv4/ip_output.c:306 [inline] __ip_finish_output+0x396/0x650 net/ipv4/ip_output.c:288 ip_finish_output+0x32/0x200 net/ipv4/ip_output.c:316 NF_HOOK_COND include/linux/netfilter.h:296 [inline] ip_output+0x196/0x310 net/ipv4/ip_output.c:430 dst_output include/net/dst.h:451 [inline] ip_local_out+0xaf/0x1a0 net/ipv4/ip_output.c:126 iptunnel_xmit+0x628/0xa50 net/ipv4/ip_tunnel_core.c:82 geneve_xmit_skb drivers/net/geneve.c:966 [inline] geneve_xmit+0x10c8/0x3530 drivers/net/geneve.c:1077 __netdev_start_xmit include/linux/netdevice.h:4683 [inline] netdev_start_xmit include/linux/netdevice.h:4697 [inline] xmit_one net/core/dev.c:3473 [inline] dev_hard_start_xmit+0x1eb/0x920 net/core/dev.c:3489 __dev_queue_xmit+0x2985/0x3660 net/core/dev.c:4116 neigh_hh_output include/net/neighbour.h:533 [inline] neigh_output include/net/neighbour.h:547 [inline] ip6_finish_output2+0xf7a/0x14f0 net/ipv6/ip6_output.c:126 __ip6_finish_output net/ipv6/ip6_output.c:191 [inline] __ip6_finish_output+0x61e/0xe90 net/ipv6/ip6_output.c:170 ip6_finish_output+0x32/0x200 net/ipv6/ip6_output.c:201 NF_HOOK_COND include/linux/netfilter.h:296 [inline] ip6_output+0x1e4/0x530 net/ipv6/ip6_output.c:224 dst_output include/net/dst.h:451 [inline] NF_HOOK include/linux/netfilter.h:307 [inline] NF_HOOK include/linux/netfilter.h:301 [inline] mld_sendpack+0x9a3/0xe40 net/ipv6/mcast.c:1826 mld_send_cr net/ipv6/mcast.c:2127 [inline] mld_ifc_work+0x71c/0xdc0 net/ipv6/mcast.c:2659 process_one_work+0x9ac/0x1650 kernel/workqueue.c:2307 worker_thread+0x657/0x1110 kernel/workqueue.c:2454 kthread+0x2e9/0x3a0 kernel/kthread.c:377 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295 </TASK> ---------------- Code disassembly (best guess): 0: 48 89 eb mov %rbp,%rbx 3: c6 45 01 01 movb $0x1,0x1(%rbp) 7: 41 bc 00 80 00 00 mov $0x8000,%r12d d: 48 c1 e9 03 shr $0x3,%rcx 11: 83 e3 07 and $0x7,%ebx 14: 41 be 01 00 00 00 mov $0x1,%r14d 1a: 48 b8 00 00 00 00 00 movabs $0xdffffc0000000000,%rax 21: fc ff df 24: 4c 8d 2c 01 lea (%rcx,%rax,1),%r13 28: eb 0c jmp 0x36 * 2a: f3 90 pause <-- trapping instruction 2c: 41 83 ec 01 sub $0x1,%r12d 30: 0f 84 72 04 00 00 je 0x4a8 36: 41 0f b6 45 00 movzbl 0x0(%r13),%eax 3b: 38 d8 cmp %bl,%al 3d: 7f 08 jg 0x47 3f: 84 .byte 0x84 Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Jiri Pirko <jiri@resnulli.us> Reported-by: syzbot <syzkaller@googlegroups.com> Link: https://lore.kernel.org/r/20220215235305.3272331-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17tipc: fix wrong notification node addressesJon Maloy1-5/+6
The previous bug fix had an unfortunate side effect that broke distribution of binding table entries between nodes. The updated tipc_sock_addr struct is also used further down in the same function, and there the old value is still the correct one. Fixes: 032062f363b4 ("tipc: fix wrong publisher node address in link publications") Signed-off-by: Jon Maloy <jmaloy@redhat.com> Link: https://lore.kernel.org/r/20220216020009.3404578-1-jmaloy@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17net: dsa: lantiq_gswip: fix use after free in gswip_remove()Alexey Khoroshilov1-1/+1
of_node_put(priv->ds->slave_mii_bus->dev.of_node) should be done before mdiobus_free(priv->ds->slave_mii_bus). Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Fixes: 0d120dfb5d67 ("net: dsa: lantiq_gswip: don't use devres for mdiobus") Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/1644921768-26477-1-git-send-email-khoroshilov@ispras.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17ipv6: per-netns exclusive flowlabel checksWillem de Bruijn3-3/+9
Ipv6 flowlabels historically require a reservation before use. Optionally in exclusive mode (e.g., user-private). Commit 59c820b2317f ("ipv6: elide flowlabel check if no exclusive leases exist") introduced a fastpath that avoids this check when no exclusive leases exist in the system, and thus any flowlabel use will be granted. That allows skipping the control operation to reserve a flowlabel entirely. Though with a warning if the fast path fails: This is an optimization. Robust applications still have to revert to requesting leases if the fast path fails due to an exclusive lease. Still, this is subtle. Better isolate network namespaces from each other. Flowlabels are per-netns. Also record per-netns whether exclusive leases are in use. Then behavior does not change based on activity in other netns. Changes v2 - wrap in IS_ENABLED(CONFIG_IPV6) to avoid breakage if disabled Fixes: 59c820b2317f ("ipv6: elide flowlabel check if no exclusive leases exist") Link: https://lore.kernel.org/netdev/MWHPR2201MB1072BCCCFCE779E4094837ACD0329@MWHPR2201MB1072.namprd22.prod.outlook.com/ Reported-by: Congyu Liu <liu3101@purdue.edu> Signed-off-by: Willem de Bruijn <willemb@google.com> Tested-by: Congyu Liu <liu3101@purdue.edu> Link: https://lore.kernel.org/r/20220215160037.1976072-1-willemdebruijn.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17net: dsa: tag_8021q: only call skb_push/skb_pull around __skb_vlan_popVladimir Oltean1-2/+2
__skb_vlan_pop() needs skb->data to point at the mac_header, while skb_vlan_tag_present() and skb_vlan_tag_get() don't, because they don't look at skb->data at all. So we can avoid uselessly moving around skb->data for the case where the VLAN tag was offloaded by the DSA master. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20220215204722.2134816-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17net: bridge: multicast: notify switchdev driver whenever MC processing gets ↵Oleksandr Mazur1-0/+4
disabled Whenever bridge driver hits the max capacity of MDBs, it disables the MC processing (by setting corresponding bridge option), but never notifies switchdev about such change (the notifiers are called only upon explicit setting of this option, through the registered netlink interface). This could lead to situation when Software MDB processing gets disabled, but this event never gets offloaded to the underlying Hardware. Fix this by adding a notify message in such case. Fixes: 147c1e9b902c ("switchdev: bridge: Offload multicast disabled") Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Link: https://lore.kernel.org/r/20220215165303.31908-1-oleksandr.mazur@plvision.eu Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17net: ethernet: altera: cleanup commentsTom Rix2-5/+5
Replacements: queueing to queuing trasfer to transfer aditional to additional adaptor to adapter transactino to transaction Signed-off-by: Tom Rix <trix@redhat.com> Link: https://lore.kernel.org/r/20220215213802.3043178-1-trix@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17net/smc: return ETIMEDOUT when smc_connect_clc() timeoutD. Wythe1-1/+7
When smc_connect_clc() times out, it will return -EAGAIN(tcp_recvmsg retuns -EAGAIN while timeout), then this value will passed to the application, which is quite confusing to the applications, makes inconsistency with TCP. From the manual of connect, ETIMEDOUT is more suitable, and this patch try convert EAGAIN to ETIMEDOUT in that case. Signed-off-by: D. Wythe <alibuda@linux.alibaba.com> Reviewed-by: Karsten Graul <kgraul@linux.ibm.com> Link: https://lore.kernel.org/r/1644913490-21594-1-git-send-email-alibuda@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17net: hns3: Remove unused inline function hclge_is_reset_pending()YueHaibing1-5/+0
This is unused since commit 8e2288cad6cb ("net: hns3: refactor PF cmdq init and uninit APIs with new common APIs"). Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Jie Wang <wangjie125@huawei.com> Link: https://lore.kernel.org/r/20220216113507.22368-1-yuehaibing@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17bpftool: Fix C++ additions to skeletonAndrii Nakryiko1-7/+7
Mark C++-specific T::open() and other methods as static inline to avoid symbol redefinition when multiple files use the same skeleton header in an application. Fixes: bb8ffe61ea45 ("bpftool: Add C++-specific open/load/etc skeleton wrappers") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220216233540.216642-1-andrii@kernel.org
2022-02-17cifs: fix confusing unneeded warning message on smb2.1 and earlierSteve French1-5/+6
When mounting with SMB2.1 or earlier, even with nomultichannel, we log the confusing warning message: "CIFS: VFS: multichannel is not supported on this protocol version, use 3.0 or above" Fix this so that we don't log this unless they really are trying to mount with multichannel. BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=215608 Reported-by: Kim Scarborough <kim@scarborough.kim> Cc: stable@vger.kernel.org # 5.11+ Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-02-17bpftool: Fix pretty print dump for maps without BTF loadedJiri Olsa1-16/+15
The commit e5043894b21f ("bpftool: Use libbpf_get_error() to check error") fails to dump map without BTF loaded in pretty mode (-p option). Fixing this by making sure get_map_kv_btf won't fail in case there's no BTF available for the map. Fixes: e5043894b21f ("bpftool: Use libbpf_get_error() to check error") Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220216092102.125448-1-jolsa@kernel.org
2022-02-16module: fix building with sysfs disabledDmitry Torokhov1-0/+2
Sysfs support might be disabled so we need to guard the code that instantiates "compression" attribute with an #ifdef. Fixes: b1ae6dc41eaa ("module: add in-kernel support for decompressing") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2022-02-16bpf: Fix crash due to out of bounds access into reg2btf_ids.Kumar Kartikeya Dwivedi1-2/+3
When commit e6ac2450d6de ("bpf: Support bpf program calling kernel function") added kfunc support, it defined reg2btf_ids as a cheap way to translate the verifier reg type to the appropriate btf_vmlinux BTF ID, however commit c25b2ae13603 ("bpf: Replace PTR_TO_XXX_OR_NULL with PTR_TO_XXX | PTR_MAYBE_NULL") moved the __BPF_REG_TYPE_MAX from the last member of bpf_reg_type enum to after the base register types, and defined other variants using type flag composition. However, now, the direct usage of reg->type to index into reg2btf_ids may no longer fall into __BPF_REG_TYPE_MAX range, and hence lead to out of bounds access and kernel crash on dereference of bad pointer. Fixes: c25b2ae13603 ("bpf: Replace PTR_TO_XXX_OR_NULL with PTR_TO_XXX | PTR_MAYBE_NULL") Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220216201943.624869-1-memxor@gmail.com
2022-02-16NFS: Do not report writeback errors in nfs_getattr()Trond Myklebust1-6/+3
The result of the writeback, whether it is an ENOSPC or an EIO, or anything else, does not inhibit the NFS client from reporting the correct file timestamps. Fixes: 79566ef018f5 ("NFS: Getattr doesn't require data sync semantics") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-02-16Merge tag 'mmc-v5.17-rc1-2' of ↵Linus Torvalds1-14/+14
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc Pull MMC fix from Ulf Hansson: "Fix recovery logic for multi block I/O reads (MMC_READ_MULTIPLE_BLOCK)" * tag 'mmc-v5.17-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: block: fix read single on recovery logic
2022-02-16Merge branch 'libbpf: Implement BTFGen'Andrii Nakryiko10-113/+863
Mauricio Vásquez <mauricio@kinvolk.io> says: ==================== CO-RE requires to have BTF information describing the kernel types in order to perform the relocations. This is usually provided by the kernel itself when it's configured with CONFIG_DEBUG_INFO_BTF. However, this configuration is not enabled in all the distributions and it's not available on kernels before 5.12. It's possible to use CO-RE in kernels without CONFIG_DEBUG_INFO_BTF support by providing the BTF information from an external source. BTFHub[0] contains BTF files to each released kernel not supporting BTF, for the most popular distributions. Providing this BTF file for a given kernel has some challenges: 1. Each BTF file is a few MBs big, then it's not possible to ship the eBPF program with all the BTF files needed to run in different kernels. (The BTF files will be in the order of GBs if you want to support a high number of kernels) 2. Downloading the BTF file for the current kernel at runtime delays the start of the program and it's not always possible to reach an external host to download such a file. Providing the BTF file with the information about all the data types of the kernel for running an eBPF program is an overkill in many of the cases. Usually the eBPF programs access only some kernel fields. This series implements BTFGen support in bpftool. This idea was discussed during the "Towards truly portable eBPF"[1] presentation at Linux Plumbers 2021. There is a good example[2] on how to use BTFGen and BTFHub together to generate multiple BTF files, to each existing/supported kernel, tailored to one application. For example: a complex bpf object might support nearly 400 kernels by having BTF files summing only 1.5 MB. [0]: https://github.com/aquasecurity/btfhub/ [1]: https://www.youtube.com/watch?v=igJLKyP1lFk&t=2418s [2]: https://github.com/aquasecurity/btfhub/tree/main/tools Changelog: v6 > v7: - use array instead of hashmap to store ids - use btf__add_{struct,union}() instead of memcpy() - don't use fixed path for testing BTF file - update example to use DECLARE_LIBBPF_OPTS() v5 > v6: - use BTF structure to store used member/types instead of hashmaps - remove support for input/output folders - remove bpf_core_{created,free}_cand_cache() - reorganize commits to avoid having unused static functions - remove usage of libbpf_get_error() - fix some errno propagation issues - do not record full types for type-based relocations - add support for BTF_KIND_FUNC_PROTO - implement tests based on core_reloc ones v4 > v5: - move some checks before invoking prog->obj->gen_loader - use p_info() instead of printf() - improve command output - fix issue with record_relo_core() - implement bash completion - write man page - implement some tests v3 > v4: - parse BTF and BTF.ext sections in bpftool and use bpf_core_calc_relo_insn() directly - expose less internal details from libbpf to bpftool - implement support for enum-based relocations - split commits in a more granular way v2 > v3: - expose internal libbpf APIs to bpftool instead - implement btfgen in bpftool - drop btf__raw_data() from libbpf v1 > v2: - introduce bpf_object__prepare() and ‘record_core_relos’ to expose CO-RE relocations instead of bpf_object__reloc_info_gen() - rename btf__save_to_file() to btf__raw_data() v1: https://lore.kernel.org/bpf/20211027203727.208847-1-mauricio@kinvolk.io/ v2: https://lore.kernel.org/bpf/20211116164208.164245-1-mauricio@kinvolk.io/ v3: https://lore.kernel.org/bpf/20211217185654.311609-1-mauricio@kinvolk.io/ v4: https://lore.kernel.org/bpf/20220112142709.102423-1-mauricio@kinvolk.io/ v5: https://lore.kernel.org/bpf/20220128223312.1253169-1-mauricio@kinvolk.io/ v6: https://lore.kernel.org/bpf/20220209222646.348365-1-mauricio@kinvolk.io/ Mauricio Vásquez (6): libbpf: split bpf_core_apply_relo() libbpf: Expose bpf_core_{add,free}_cands() to bpftool bpftool: Add gen min_core_btf command bpftool: Implement "gen min_core_btf" logic bpftool: Implement btfgen_get_btf() selftests/bpf: Test "bpftool gen min_core_btf" ==================== Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2022-02-16selftests/bpf: Test "bpftool gen min_core_btf"Mauricio Vásquez1-2/+48
This commit reuses the core_reloc test to check if the BTF files generated with "bpftool gen min_core_btf" are correct. This introduces test_core_btfgen() that runs all the core_reloc tests, but this time the source BTF files are generated by using "bpftool gen min_core_btf". The goal of this test is to check that the generated files are usable, and not to check if the algorithm is creating an optimized BTF file. Signed-off-by: Mauricio Vásquez <mauricio@kinvolk.io> Signed-off-by: Rafael David Tinoco <rafael.tinoco@aquasec.com> Signed-off-by: Lorenzo Fontana <lorenzo.fontana@elastic.co> Signed-off-by: Leonardo Di Donato <leonardo.didonato@elastic.co> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220215225856.671072-8-mauricio@kinvolk.io
2022-02-16tty: n_tty: do not look ahead for EOL character past the end of the bufferLinus Torvalds1-4/+2
Daniel Gibson reports that the n_tty code gets line termination wrong in very specific cases: "If you feed a line with exactly 64 chars + terminating newline, and directly afterwards (without reading) another line into a pseudo terminal, the the first read() on the other side will return the 64 char line *without* terminating newline, and the next read() will return the missing terminating newline AND the complete next line (if it fits in the buffer)" and bisected the behavior to commit 3b830a9c34d5 ("tty: convert tty_ldisc_ops 'read()' function to take a kernel pointer"). Now, digging deeper, it turns out that the behavior isn't exactly new: what changed in commit 3b830a9c34d5 was that the tty line discipline .read() function is now passed an intermediate kernel buffer rather than the final user space buffer. And that intermediate kernel buffer is 64 bytes in size - thus that special case with exactly 64 bytes plus terminating newline. The same problem did exist before, but historically the boundary was not the 64-byte chunk, but the user-supplied buffer size, which is obviously generally bigger (and potentially bigger than N_TTY_BUF_SIZE, which would hide the issue entirely). The reason is that the n_tty canon_copy_from_read_buf() code would look ahead for the EOL character one byte further than it would actually copy. It would then decide that it had found the terminator, and unmark it as an EOL character - which in turn explains why the next read wouldn't then be terminated by it. Now, the reason it did all this in the first place is related to some historical and pretty obscure EOF behavior, see commit ac8f3bf8832a ("n_tty: Fix poll() after buffer-limited eof push read") and commit 40d5e0905a03 ("n_tty: Fix EOF push handling"). And the reason for the EOL confusion is that we treat EOF as a special EOL condition, with the EOL character being NUL (aka "__DISABLED_CHAR" in the kernel sources). So that EOF look-ahead also affects the normal EOL handling. This patch just removes the look-ahead that causes problems, because EOL is much more critical than the historical "EOF in the middle of a line that coincides with the end of the buffer" handling ever was. Now, it is possible that we should indeed re-introduce the "look at next character to see if it's a EOF" behavior, but if so, that should be done not at the kernel buffer chunk boundary in canon_copy_from_read_buf(), but at a higher level, when we run out of the user buffer. In particular, the place to do that would be at the top of 'n_tty_read()', where we check if it's a continuation of a previously started read, and there is no more buffer space left, we could decide to just eat the __DISABLED_CHAR at that point. But that would be a separate patch, because I suspect nobody actually cares, and I'd like to get a report about it before bothering. Fixes: 3b830a9c34d5 ("tty: convert tty_ldisc_ops 'read()' function to take a kernel pointer") Fixes: ac8f3bf8832a ("n_tty: Fix poll() after buffer-limited eof push read") Fixes: 40d5e0905a03 ("n_tty: Fix EOF push handling") Link: https://bugzilla.kernel.org/show_bug.cgi?id=215611 Reported-and-tested-by: Daniel Gibson <metalcaedes@gmail.com> Cc: Peter Hurley <peter@hurleysoftware.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jiri Slaby <jirislaby@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-16bpftool: Gen min_core_btf explanation and examplesRafael David Tinoco1-0/+90
Add "min_core_btf" feature explanation and one example of how to use it to bpftool-gen man page. Signed-off-by: Mauricio Vásquez <mauricio@kinvolk.io> Signed-off-by: Rafael David Tinoco <rafael.tinoco@aquasec.com> Signed-off-by: Lorenzo Fontana <lorenzo.fontana@elastic.co> Signed-off-by: Leonardo Di Donato <leonardo.didonato@elastic.co> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220215225856.671072-7-mauricio@kinvolk.io
2022-02-16bpftool: Implement btfgen_get_btf()Mauricio Vásquez1-1/+99
The last part of the BTFGen algorithm is to create a new BTF object with all the types that were recorded in the previous steps. This function performs two different steps: 1. Add the types to the new BTF object by using btf__add_type(). Some special logic around struct and unions is implemented to only add the members that are really used in the field-based relocations. The type ID on the new and old BTF objects is stored on a map. 2. Fix all the type IDs on the new BTF object by using the IDs saved in the previous step. Signed-off-by: Mauricio Vásquez <mauricio@kinvolk.io> Signed-off-by: Rafael David Tinoco <rafael.tinoco@aquasec.com> Signed-off-by: Lorenzo Fontana <lorenzo.fontana@elastic.co> Signed-off-by: Leonardo Di Donato <leonardo.didonato@elastic.co> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220215225856.671072-6-mauricio@kinvolk.io
2022-02-16bpftool: Implement "gen min_core_btf" logicMauricio Vásquez2-6/+457
This commit implements the logic for the gen min_core_btf command. Specifically, it implements the following functions: - minimize_btf(): receives the path of a source and destination BTF files and a list of BPF objects. This function records the relocations for all objects and then generates the BTF file by calling btfgen_get_btf() (implemented in the following commit). - btfgen_record_obj(): loads the BTF and BTF.ext sections of the BPF objects and loops through all CO-RE relocations. It uses bpf_core_calc_relo_insn() from libbpf and passes the target spec to btfgen_record_reloc(), that calls one of the following functions depending on the relocation kind. - btfgen_record_field_relo(): uses the target specification to mark all the types that are involved in a field-based CO-RE relocation. In this case types resolved and marked recursively using btfgen_mark_type(). Only the struct and union members (and their types) involved in the relocation are marked to optimize the size of the generated BTF file. - btfgen_record_type_relo(): marks the types involved in a type-based CO-RE relocation. In this case no members for the struct and union types are marked as libbpf doesn't use them while performing this kind of relocation. Pointed types are marked as they are used by libbpf in this case. - btfgen_record_enumval_relo(): marks the whole enum type for enum-based relocations. Signed-off-by: Mauricio Vásquez <mauricio@kinvolk.io> Signed-off-by: Rafael David Tinoco <rafael.tinoco@aquasec.com> Signed-off-by: Lorenzo Fontana <lorenzo.fontana@elastic.co> Signed-off-by: Leonardo Di Donato <leonardo.didonato@elastic.co> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220215225856.671072-5-mauricio@kinvolk.io
2022-02-16bpftool: Add gen min_core_btf commandMauricio Vásquez2-4/+44
This command is implemented under the "gen" command in bpftool and the syntax is the following: $ bpftool gen min_core_btf INPUT OUTPUT OBJECT [OBJECT...] INPUT is the file that contains all the BTF types for a kernel and OUTPUT is the path of the minimize BTF file that will be created with only the types needed by the objects. Signed-off-by: Mauricio Vásquez <mauricio@kinvolk.io> Signed-off-by: Rafael David Tinoco <rafael.tinoco@aquasec.com> Signed-off-by: Lorenzo Fontana <lorenzo.fontana@elastic.co> Signed-off-by: Leonardo Di Donato <leonardo.didonato@elastic.co> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220215225856.671072-4-mauricio@kinvolk.io
2022-02-16libbpf: Expose bpf_core_{add,free}_cands() to bpftoolMauricio Vásquez2-7/+19
Expose bpf_core_add_cands() and bpf_core_free_cands() to handle candidates list. Signed-off-by: Mauricio Vásquez <mauricio@kinvolk.io> Signed-off-by: Rafael David Tinoco <rafael.tinoco@aquasec.com> Signed-off-by: Lorenzo Fontana <lorenzo.fontana@elastic.co> Signed-off-by: Leonardo Di Donato <leonardo.didonato@elastic.co> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220215225856.671072-3-mauricio@kinvolk.io
2022-02-16libbpf: Split bpf_core_apply_relo()Mauricio Vásquez4-96/+109
BTFGen needs to run the core relocation logic in order to understand what are the types involved in a given relocation. Currently bpf_core_apply_relo() calculates and **applies** a relocation to an instruction. Having both operations in the same function makes it difficult to only calculate the relocation without patching the instruction. This commit splits that logic in two different phases: (1) calculate the relocation and (2) patch the instruction. For the first phase bpf_core_apply_relo() is renamed to bpf_core_calc_relo_insn() who is now only on charge of calculating the relocation, the second phase uses the already existing bpf_core_patch_insn(). bpf_object__relocate_core() uses both of them and the BTFGen will use only bpf_core_calc_relo_insn(). Signed-off-by: Mauricio Vásquez <mauricio@kinvolk.io> Signed-off-by: Rafael David Tinoco <rafael.tinoco@aquasec.com> Signed-off-by: Lorenzo Fontana <lorenzo.fontana@elastic.co> Signed-off-by: Leonardo Di Donato <leonardo.didonato@elastic.co> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220215225856.671072-2-mauricio@kinvolk.io
2022-02-16ACPI: processor: idle: fix lockup regression on 32-bit ThinkPad T40Woody Suwalski1-0/+5
Add and ACPI idle power level limit for 32-bit ThinkPad T40. There is a regression on T40 introduced by commit d6b88ce2, starting with kernel 5.16: commit d6b88ce2eb9d2698eb24451eb92c0a1649b17bb1 Author: Richard Gong <richard.gong@amd.com> Date:   Wed Sep 22 08:31:16 2021 -0500 ACPI: processor idle: Allow playing dead in C3 state The above patch is trying to enter C3 state during init, what is causing a T40 system freeze. I have not found a similar issue on any other of my 32-bit machines. The fix is to add another exception to the processor_power_dmi_table[] list. As a result the dmesg shows as expected: [2.155398] ACPI: IBM ThinkPad T40 detected - limiting to C2 max_cstate. Override with "processor.max_cstate=9" [2.155404] ACPI: processor limited to max C-state 2 The fix is trivial and affects only vintage T40 systems. Fixes: d6b88ce2eb9d ("CPI: processor idle: Allow playing dead in C3 state") Signed-off-by: Woody Suwalski <wsuwalski@gmail.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Cc: 5.16+ <stable@vger.kernel.org> # 5.16+ [ rjw: New subject ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-02-16perf test: Fix arm64 perf_event_attr tests wrt --call-graph initializationGerman Gomez5-0/+24
The struct perf_event_attr is initialised differently in Arm64 when recording in call-graph fp mode, so update the relevant tests, and add two extra arm64-only tests. Before: $ perf test 17 -v 17: Setup struct perf_event_attr [...] running './tests/attr/test-record-graph-default' expected sample_type=295, got 4391 expected sample_regs_user=0, got 1073741824 FAILED './tests/attr/test-record-graph-default' - match failure test child finished with -1 ---- end ---- After: [...] running './tests/attr/test-record-graph-default-aarch64' test limitation 'aarch64' running './tests/attr/test-record-graph-fp-aarch64' test limitation 'aarch64' running './tests/attr/test-record-graph-default' test limitation '!aarch64' excluded architecture list ['aarch64'] skipped [aarch64] './tests/attr/test-record-graph-default' running './tests/attr/test-record-graph-fp' test limitation '!aarch64' excluded architecture list ['aarch64'] skipped [aarch64] './tests/attr/test-record-graph-fp' [...] Fixes: 7248e308a5758761 ("perf tools: Record ARM64 LR register automatically") Signed-off-by: German Gomez <german.gomez@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Truong <alexandre.truong@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Fastabend <john.fastabend@gmail.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Song Liu <songliubraving@fb.com> Cc: Yonghong Song <yhs@fb.com> Link: http://lore.kernel.org/lkml/20220125104435.2737-1-german.gomez@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-16libsubcmd: Fix use-after-free for realloc(..., 0)Kees Cook1-9/+2
GCC 12 correctly reports a potential use-after-free condition in the xrealloc helper. Fix the warning by avoiding an implicit "free(ptr)" when size == 0: In file included from help.c:12: In function 'xrealloc', inlined from 'add_cmdname' at help.c:24:2: subcmd-util.h:56:23: error: pointer may be used after 'realloc' [-Werror=use-after-free] 56 | ret = realloc(ptr, size); | ^~~~~~~~~~~~~~~~~~ subcmd-util.h:52:21: note: call to 'realloc' here 52 | void *ret = realloc(ptr, size); | ^~~~~~~~~~~~~~~~~~ subcmd-util.h:58:31: error: pointer may be used after 'realloc' [-Werror=use-after-free] 58 | ret = realloc(ptr, 1); | ^~~~~~~~~~~~~~~ subcmd-util.h:52:21: note: call to 'realloc' here 52 | void *ret = realloc(ptr, size); | ^~~~~~~~~~~~~~~~~~ Fixes: 2f4ce5ec1d447beb ("perf tools: Finalize subcmd independence") Reported-by: Valdis Klētnieks <valdis.kletnieks@vt.edu> Signed-off-by: Kees Kook <keescook@chromium.org> Tested-by: Valdis Klētnieks <valdis.kletnieks@vt.edu> Tested-by: Justin M. Forbes <jforbes@fedoraproject.org> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: linux-hardening@vger.kernel.org Cc: Valdis Klētnieks <valdis.kletnieks@vt.edu> Link: http://lore.kernel.org/lkml/20220213182443.4037039-1-keescook@chromium.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-16libperf: Fix perf_cpu_map__for_each_cpu macroJiri Olsa4-5/+18
Tzvetomir Stoyanov reported an issue with using macro perf_cpu_map__for_each_cpu using private perf_cpu object. The issue is caused by recent change that wrapped cpu in struct perf_cpu to distinguish it from cpu indexes. We need to make struct perf_cpu public. Add a simple test for using the perf_cpu_map__for_each_cpu macro. Fixes: 6d18804b963b78dc ("perf cpumap: Give CPUs their own type") Reported-by: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20220215153713.31395-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-16perf cs-etm: Fix corrupt inject files when only last branch option is enabledJames Clark1-0/+2
'perf inject' with Coresight data generates files that cannot be opened when only the last branch option is specified: perf inject -i perf.data --itrace=l -o inject.data perf script -i inject.data 0x33faa8 [0x8]: failed to process type: 9 [Bad address] This is because cs_etm__synth_instruction_sample() is called even when the sample type for instructions hasn't been setup. Last branch records are attached to instruction samples so it doesn't make sense to generate them when --itrace=i isn't specified anyway. This change disables all calls of cs_etm__synth_instruction_sample() unless --itrace=i is specified, resulting in a file with no samples if only --itrace=l is provided, rather than a bad file. Reviewed-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: James Clark <james.clark@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20220210200620.1227232-2-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>