kernel/linux.git/net/core/netclassid_cgroup.c, branch v7.0.10

Convert 'alloc_obj' family to use the new default GFP_KERNEL argument

2026-02-22T01:09:51+00:00

This was done entirely with mindless brute force, using git grep -l '\

treewide: Replace kmalloc with kmalloc_obj for non-scalar types

2026-02-21T09:02:28+00:00

This is the result of running the Coccinelle script from scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to avoid scalar types (which need careful case-by-case checking), and instead replace kmalloc-family calls that allocate struct or union object instances: Single allocations: kmalloc(sizeof(TYPE), ...) are replaced with: kmalloc_obj(TYPE, ...) Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...) are replaced with: kmalloc_objs(TYPE, COUNT, ...) Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...) are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...) (where TYPE may also be *VAR) The resulting allocations no longer return "void *", instead returning "TYPE *". Signed-off-by: Kees Cook

netclassid: use thread_group_leader(p) in update_classid_task()

2026-02-03T16:21:26+00:00

Cleanup and preparation to simplify planned future changes. Link: https://lkml.kernel.org/r/aXY_4NSP094-Cf-2@redhat.com Signed-off-by: Oleg Nesterov Cc: Alice Ryhl Cc: Boris Brezillon Cc: Christan König Cc: David S. Miller Cc: Eric Dumazet Cc: Felix Kuehling Cc: Jakub Kicinski Cc: Leon Romanovsky Cc: Paolo Abeni Cc: Simon Horman Cc: Steven Price Signed-off-by: Andrew Morton

net, bpf: Fix RCU usage in task_cls_state() for BPF programs

2025-06-11T19:30:29+00:00

The commit ee971630f20f ("bpf: Allow some trace helpers for all prog types") made bpf_get_cgroup_classid_curr helper available to all BPF program types, not just networking programs. This helper calls __task_get_classid() which internally calls task_cls_state() requiring rcu_read_lock_bh_held(). This works in networking/tc context where RCU BH is held, but triggers an RCU warning when called from other contexts like BPF syscall programs that run under rcu_read_lock_trace(): WARNING: suspicious RCU usage 6.15.0-rc4-syzkaller-g079e5c56a5c4 #0 Not tainted ----------------------------- net/core/netclassid_cgroup.c:24 suspicious rcu_dereference_check() usage! Fix this by also accepting rcu_read_lock_held() and rcu_read_lock_trace_held() as valid RCU contexts in the task_cls_state() function. This ensures the helper works correctly in all needed RCU contexts where it might be called, regular RCU, RCU BH (for networking), and RCU trace (for BPF syscall programs). Fixes: ee971630f20f ("bpf: Allow some trace helpers for all prog types") Reported-by: syzbot+b4169a1cfb945d2ed0ec@syzkaller.appspotmail.com Signed-off-by: Charalampos Mitrodimas Signed-off-by: Daniel Borkmann Acked-by: Daniel Borkmann Link: https://lore.kernel.org/bpf/20250611-rcu-fix-task_cls_state-v3-1-3d30e1de753f@posteo.net Closes: https://syzkaller.appspot.com/bug?extid=b4169a1cfb945d2ed0ec

cgroup, netclassid: on modifying netclassid in cgroup, only consider the main process.

2023-10-16T23:36:53+00:00

When modifying netclassid, the command("echo 0x100001 > net_cls.classid") will take more time on many threads of one process, because the process create many fds. for example, one process exists 28000 fds and 60000 threads, echo command will task 45 seconds. Now, we only consider the main process when exec "iterate_fd", and the time is about 52 milliseconds. Signed-off-by: Liansen Zhai Reviewed-by: Simon Horman Link: https://lore.kernel.org/r/20231012090330.29636-1-zhailiansen@kuaishou.com Signed-off-by: Jakub Kicinski

core: Variable type completion

2022-08-31T08:40:34+00:00

'unsigned int' is better than 'unsigned'. Signed-off-by: Xin Gao Signed-off-by: David S. Miller

bpf, cgroups: Fix cgroup v2 fallback on v1/v2 mixed mode

2021-09-13T23:35:58+00:00

Fix cgroup v1 interference when non-root cgroup v2 BPF programs are used. Back in the days, commit bd1060a1d671 ("sock, cgroup: add sock->sk_cgroup") embedded per-socket cgroup information into sock->sk_cgrp_data and in order to save 8 bytes in struct sock made both mutually exclusive, that is, when cgroup v1 socket tagging (e.g. net_cls/net_prio) is used, then cgroup v2 falls back to the root cgroup in sock_cgroup_ptr() (&cgrp_dfl_root.cgrp). The assumption made was "there is no reason to mix the two and this is in line with how legacy and v2 compatibility is handled" as stated in bd1060a1d671. However, with Kubernetes more widely supporting cgroups v2 as well nowadays, this assumption no longer holds, and the possibility of the v1/v2 mixed mode with the v2 root fallback being hit becomes a real security issue. Many of the cgroup v2 BPF programs are also used for policy enforcement, just to pick _one_ example, that is, to programmatically deny socket related system calls like connect(2) or bind(2). A v2 root fallback would implicitly cause a policy bypass for the affected Pods. In production environments, we have recently seen this case due to various circumstances: i) a different 3rd party agent and/or ii) a container runtime such as [0] in the user's environment configuring legacy cgroup v1 net_cls tags, which triggered implicitly mentioned root fallback. Another case is Kubernetes projects like kind [1] which create Kubernetes nodes in a container and also add cgroup namespaces to the mix, meaning programs which are attached to the cgroup v2 root of the cgroup namespace get attached to a non-root cgroup v2 path from init namespace point of view. And the latter's root is out of reach for agents on a kind Kubernetes node to configure. Meaning, any entity on the node setting cgroup v1 net_cls tag will trigger the bypass despite cgroup v2 BPF programs attached to the namespace root. Generally, this mutual exclusiveness does not hold anymore in today's user environments and makes cgroup v2 usage from BPF side fragile and unreliable. This fix adds proper struct cgroup pointer for the cgroup v2 case to struct sock_cgroup_data in order to address these issues; this implicitly also fixes the tradeoffs being made back then with regards to races and refcount leaks as stated in bd1060a1d671, and removes the fallback, so that cgroup v2 BPF programs always operate as expected. [0] https://github.com/nestybox/sysbox/ [1] https://kind.sigs.k8s.io/ Fixes: bd1060a1d671 ("sock, cgroup: add sock->sk_cgroup") Signed-off-by: Daniel Borkmann Signed-off-by: Alexei Starovoitov Acked-by: Stanislav Fomichev Acked-by: Tejun Heo Link: https://lore.kernel.org/bpf/20210913230759.2313-1-daniel@iogearbox.net

net: Remove the err argument from sock_from_file

2020-12-04T21:32:40+00:00

Currently, the sock_from_file prototype takes an "err" pointer that is either not set or set to -ENOTSOCK IFF the returned socket is NULL. This makes the error redundant and it is ignored by a few callers. This patch simplifies the API by letting callers deduce the error based on whether the returned socket is NULL or not. Suggested-by: Al Viro Signed-off-by: Florent Revest Signed-off-by: Daniel Borkmann Reviewed-by: KP Singh Link: https://lore.kernel.org/bpf/20201204113609.1850150-1-revest@google.com

cgroup, netclassid: remove double cond_resched

2020-04-21T22:44:30+00:00

Commit 018d26fcd12a ("cgroup, netclassid: periodically release file_lock on classid") added a second cond_resched to write_classid indirectly by update_classid_task. Remove the one in write_classid. Signed-off-by: Jiri Slaby Cc: Dmitry Yakunin Cc: Konstantin Khlebnikov Cc: David S. Miller Signed-off-by: David S. Miller

cgroup, netclassid: periodically release file_lock on classid updating

2020-03-10T01:13:39+00:00

In our production environment we have faced with problem that updating classid in cgroup with heavy tasks cause long freeze of the file tables in this tasks. By heavy tasks we understand tasks with many threads and opened sockets (e.g. balancers). This freeze leads to an increase number of client timeouts. This patch implements following logic to fix this issue: аfter iterating 1000 file descriptors file table lock will be released thus providing a time gap for socket creation/deletion. Now update is non atomic and socket may be skipped using calls: dup2(oldfd, newfd); close(oldfd); But this case is not typical. Moreover before this patch skip is possible too by hiding socket fd in unix socket buffer. New sockets will be allocated with updated classid because cgroup state is updated before start of the file descriptors iteration. So in common cases this patch has no side effects. Signed-off-by: Dmitry Yakunin Reviewed-by: Konstantin Khlebnikov Signed-off-by: David S. Miller