<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/arch/parisc/include/uapi/asm/socket.h, branch v6.6.132</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.6.132</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.6.132'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2023-06-12T09:45:50+00:00</updated>
<entry>
<title>net: core: add getsockopt SO_PEERPIDFD</title>
<updated>2023-06-12T09:45:50+00:00</updated>
<author>
<name>Alexander Mikhalitsyn</name>
<email>aleksandr.mikhalitsyn@canonical.com</email>
</author>
<published>2023-06-08T20:26:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7b26952a91cf65ff1cc867a2382a8964d8c0ee7d'/>
<id>urn:sha1:7b26952a91cf65ff1cc867a2382a8964d8c0ee7d</id>
<content type='text'>
Add SO_PEERPIDFD which allows to get pidfd of peer socket holder pidfd.
This thing is direct analog of SO_PEERCRED which allows to get plain PID.

Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Jakub Kicinski &lt;kuba@kernel.org&gt;
Cc: Paolo Abeni &lt;pabeni@redhat.com&gt;
Cc: Leon Romanovsky &lt;leon@kernel.org&gt;
Cc: David Ahern &lt;dsahern@kernel.org&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Cc: Kuniyuki Iwashima &lt;kuniyu@amazon.com&gt;
Cc: Lennart Poettering &lt;mzxreary@0pointer.de&gt;
Cc: Luca Boccassi &lt;bluca@debian.org&gt;
Cc: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Cc: Stanislav Fomichev &lt;sdf@google.com&gt;
Cc: bpf@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: linux-arch@vger.kernel.org
Reviewed-by: Christian Brauner &lt;brauner@kernel.org&gt;
Acked-by: Stanislav Fomichev &lt;sdf@google.com&gt;
Tested-by: Luca Boccassi &lt;bluca@debian.org&gt;
Signed-off-by: Alexander Mikhalitsyn &lt;aleksandr.mikhalitsyn@canonical.com&gt;
Reviewed-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>scm: add SO_PASSPIDFD and SCM_PIDFD</title>
<updated>2023-06-12T09:45:49+00:00</updated>
<author>
<name>Alexander Mikhalitsyn</name>
<email>aleksandr.mikhalitsyn@canonical.com</email>
</author>
<published>2023-06-08T20:26:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5e2ff6704a275be009be8979af17c52361b79b89'/>
<id>urn:sha1:5e2ff6704a275be009be8979af17c52361b79b89</id>
<content type='text'>
Implement SCM_PIDFD, a new type of CMSG type analogical to SCM_CREDENTIALS,
but it contains pidfd instead of plain pid, which allows programmers not
to care about PID reuse problem.

We mask SO_PASSPIDFD feature if CONFIG_UNIX is not builtin because
it depends on a pidfd_prepare() API which is not exported to the kernel
modules.

Idea comes from UAPI kernel group:
https://uapi-group.org/kernel-features/

Big thanks to Christian Brauner and Lennart Poettering for productive
discussions about this.

Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Jakub Kicinski &lt;kuba@kernel.org&gt;
Cc: Paolo Abeni &lt;pabeni@redhat.com&gt;
Cc: Leon Romanovsky &lt;leon@kernel.org&gt;
Cc: David Ahern &lt;dsahern@kernel.org&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Cc: Kuniyuki Iwashima &lt;kuniyu@amazon.com&gt;
Cc: Lennart Poettering &lt;mzxreary@0pointer.de&gt;
Cc: Luca Boccassi &lt;bluca@debian.org&gt;
Cc: linux-kernel@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: linux-arch@vger.kernel.org
Tested-by: Luca Boccassi &lt;bluca@debian.org&gt;
Reviewed-by: Kuniyuki Iwashima &lt;kuniyu@amazon.com&gt;
Reviewed-by: Christian Brauner &lt;brauner@kernel.org&gt;
Signed-off-by: Alexander Mikhalitsyn &lt;aleksandr.mikhalitsyn@canonical.com&gt;
Reviewed-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: SO_RCVMARK socket option for SO_MARK with recvmsg()</title>
<updated>2022-04-28T20:08:15+00:00</updated>
<author>
<name>Erin MacNeil</name>
<email>lnx.erin@gmail.com</email>
</author>
<published>2022-04-27T20:02:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=6fd1d51cfa253b5ee7dae18d7cf1df830e9b6137'/>
<id>urn:sha1:6fd1d51cfa253b5ee7dae18d7cf1df830e9b6137</id>
<content type='text'>
Adding a new socket option, SO_RCVMARK, to indicate that SO_MARK
should be included in the ancillary data returned by recvmsg().

Renamed the sock_recv_ts_and_drops() function to sock_recv_cmsgs().

Signed-off-by: Erin MacNeil &lt;lnx.erin@gmail.com&gt;
Reviewed-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reviewed-by: David Ahern &lt;dsahern@kernel.org&gt;
Acked-by: Marc Kleine-Budde &lt;mkl@pengutronix.de&gt;
Link: https://lore.kernel.org/r/20220427200259.2564-1-lnx.erin@gmail.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>txhash: Add socket option to control TX hash rethink behavior</title>
<updated>2022-01-31T15:05:25+00:00</updated>
<author>
<name>Akhmat Karakotov</name>
<email>hmukos@yandex-team.ru</email>
</author>
<published>2022-01-31T13:31:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=26859240e4ee701e0379f08634957adaff67e43a'/>
<id>urn:sha1:26859240e4ee701e0379f08634957adaff67e43a</id>
<content type='text'>
Add the SO_TXREHASH socket option to control hash rethink behavior per socket.
When default mode is set, sockets disable rehash at initialization and use
sysctl option when entering listen state. setsockopt() overrides default
behavior.

Signed-off-by: Akhmat Karakotov &lt;hmukos@yandex-team.ru&gt;
Reviewed-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>fix up for "net: add new socket option SO_RESERVE_MEM"</title>
<updated>2021-10-01T14:00:21+00:00</updated>
<author>
<name>Stephen Rothwell</name>
<email>sfr@canb.auug.org.au</email>
</author>
<published>2021-10-01T13:43:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=10d48705d5afb854d2edf3e17a3fb222001425d6'/>
<id>urn:sha1:10d48705d5afb854d2edf3e17a3fb222001425d6</id>
<content type='text'>
Some architectures do not include uapi/asm/socket.h

Fixes: 2bb2f5fb21b0 ("net: add new socket option SO_RESERVE_MEM")
Signed-off-by: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>sock: allow reading and changing sk_userlocks with setsockopt</title>
<updated>2021-08-04T11:52:03+00:00</updated>
<author>
<name>Pavel Tikhomirov</name>
<email>ptikhomirov@virtuozzo.com</email>
</author>
<published>2021-08-04T07:55:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=04190bf8944deb7e3ac165a1a494db23aa0160a9'/>
<id>urn:sha1:04190bf8944deb7e3ac165a1a494db23aa0160a9</id>
<content type='text'>
SOCK_SNDBUF_LOCK and SOCK_RCVBUF_LOCK flags disable automatic socket
buffers adjustment done by kernel (see tcp_fixup_rcvbuf() and
tcp_sndbuf_expand()). If we've just created a new socket this adjustment
is enabled on it, but if one changes the socket buffer size by
setsockopt(SO_{SND,RCV}BUF*) it becomes disabled.

CRIU needs to call setsockopt(SO_{SND,RCV}BUF*) on each socket on
restore as it first needs to increase buffer sizes for packet queues
restore and second it needs to restore back original buffer sizes. So
after CRIU restore all sockets become non-auto-adjustable, which can
decrease network performance of restored applications significantly.

CRIU need to be able to restore sockets with enabled/disabled adjustment
to the same state it was before dump, so let's add special setsockopt
for it.

Let's also export SOCK_SNDBUF_LOCK and SOCK_RCVBUF_LOCK flags to uAPI so
that using these interface one can reenable automatic socket buffer
adjustment on their sockets.

Signed-off-by: Pavel Tikhomirov &lt;ptikhomirov@virtuozzo.com&gt;
Reviewed-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: retrieve netns cookie via getsocketopt</title>
<updated>2021-06-24T18:13:05+00:00</updated>
<author>
<name>Martynas Pumputis</name>
<email>m@lambda.lt</email>
</author>
<published>2021-06-23T13:56:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e8b9eab99232c4e62ada9d7976c80fd5e8118289'/>
<id>urn:sha1:e8b9eab99232c4e62ada9d7976c80fd5e8118289</id>
<content type='text'>
It's getting more common to run nested container environments for
testing cloud software. One of such examples is Kind [1] which runs a
Kubernetes cluster in Docker containers on a single host. Each container
acts as a Kubernetes node, and thus can run any Pod (aka container)
inside the former. This approach simplifies testing a lot, as it
eliminates complicated VM setups.

Unfortunately, such a setup breaks some functionality when cgroupv2 BPF
programs are used for load-balancing. The load-balancer BPF program
needs to detect whether a request originates from the host netns or a
container netns in order to allow some access, e.g. to a service via a
loopback IP address. Typically, the programs detect this by comparing
netns cookies with the one of the init ns via a call to
bpf_get_netns_cookie(NULL). However, in nested environments the latter
cannot be used given the Kubernetes node's netns is outside the init ns.
To fix this, we need to pass the Kubernetes node netns cookie to the
program in a different way: by extending getsockopt() with a
SO_NETNS_COOKIE option, the orchestrator which runs in the Kubernetes
node netns can retrieve the cookie and pass it to the program instead.

Thus, this is following up on Eric's commit 3d368ab87cf6 ("net:
initialize net-&gt;net_cookie at netns setup") to allow retrieval via
SO_NETNS_COOKIE.  This is also in line in how we retrieve socket cookie
via SO_COOKIE.

  [1] https://kind.sigs.k8s.io/

Signed-off-by: Lorenz Bauer &lt;lmb@cloudflare.com&gt;
Signed-off-by: Martynas Pumputis &lt;m@lambda.lt&gt;
Cc: Eric Dumazet &lt;edumazet@google.com&gt;
Reviewed-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: Add SO_BUSY_POLL_BUDGET socket option</title>
<updated>2020-11-30T23:09:25+00:00</updated>
<author>
<name>Björn Töpel</name>
<email>bjorn.topel@intel.com</email>
</author>
<published>2020-11-30T18:51:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7c951cafc0cb2e575f1d58677b95ac387ac0a5bd'/>
<id>urn:sha1:7c951cafc0cb2e575f1d58677b95ac387ac0a5bd</id>
<content type='text'>
This option lets a user set a per socket NAPI budget for
busy-polling. If the options is not set, it will use the default of 8.

Signed-off-by: Björn Töpel &lt;bjorn.topel@intel.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Reviewed-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Link: https://lore.kernel.org/bpf/20201130185205.196029-3-bjorn.topel@gmail.com
</content>
</entry>
<entry>
<title>net: Introduce preferred busy-polling</title>
<updated>2020-11-30T23:09:25+00:00</updated>
<author>
<name>Björn Töpel</name>
<email>bjorn.topel@intel.com</email>
</author>
<published>2020-11-30T18:51:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7fd3253a7de6a317a0683f83739479fb880bffc8'/>
<id>urn:sha1:7fd3253a7de6a317a0683f83739479fb880bffc8</id>
<content type='text'>
The existing busy-polling mode, enabled by the SO_BUSY_POLL socket
option or system-wide using the /proc/sys/net/core/busy_read knob, is
an opportunistic. That means that if the NAPI context is not
scheduled, it will poll it. If, after busy-polling, the budget is
exceeded the busy-polling logic will schedule the NAPI onto the
regular softirq handling.

One implication of the behavior above is that a busy/heavy loaded NAPI
context will never enter/allow for busy-polling. Some applications
prefer that most NAPI processing would be done by busy-polling.

This series adds a new socket option, SO_PREFER_BUSY_POLL, that works
in concert with the napi_defer_hard_irqs and gro_flush_timeout
knobs. The napi_defer_hard_irqs and gro_flush_timeout knobs were
introduced in commit 6f8b12d661d0 ("net: napi: add hard irqs deferral
feature"), and allows for a user to defer interrupts to be enabled and
instead schedule the NAPI context from a watchdog timer. When a user
enables the SO_PREFER_BUSY_POLL, again with the other knobs enabled,
and the NAPI context is being processed by a softirq, the softirq NAPI
processing will exit early to allow the busy-polling to be performed.

If the application stops performing busy-polling via a system call,
the watchdog timer defined by gro_flush_timeout will timeout, and
regular softirq handling will resume.

In summary; Heavy traffic applications that prefer busy-polling over
softirq processing should use this option.

Example usage:

  $ echo 2 | sudo tee /sys/class/net/ens785f1/napi_defer_hard_irqs
  $ echo 200000 | sudo tee /sys/class/net/ens785f1/gro_flush_timeout

Note that the timeout should be larger than the userspace processing
window, otherwise the watchdog will timeout and fall back to regular
softirq processing.

Enable the SO_BUSY_POLL/SO_PREFER_BUSY_POLL options on your socket.

Signed-off-by: Björn Töpel &lt;bjorn.topel@intel.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Reviewed-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Link: https://lore.kernel.org/bpf/20201130185205.196029-2-bjorn.topel@gmail.com
</content>
</entry>
<entry>
<title>bpf: net: Add SO_DETACH_REUSEPORT_BPF</title>
<updated>2019-06-14T23:21:19+00:00</updated>
<author>
<name>Martin KaFai Lau</name>
<email>kafai@fb.com</email>
</author>
<published>2019-06-13T22:00:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=99f3a064bc2e4bd5fe50218646c5be342f2ad18c'/>
<id>urn:sha1:99f3a064bc2e4bd5fe50218646c5be342f2ad18c</id>
<content type='text'>
There is SO_ATTACH_REUSEPORT_[CE]BPF but there is no DETACH.
This patch adds SO_DETACH_REUSEPORT_BPF sockopt.  The same
sockopt can be used to undo both SO_ATTACH_REUSEPORT_[CE]BPF.

reseport_detach_prog() is added and it is mostly a mirror
of the existing reuseport_attach_prog().  The differences are,
it does not call reuseport_alloc() and returns -ENOENT when
there is no old prog.

Cc: Craig Gallek &lt;kraig@google.com&gt;
Signed-off-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
Reviewed-by: Stanislav Fomichev &lt;sdf@google.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
</content>
</entry>
</feed>
