<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/include/uapi/linux/capability.h, branch v6.19.11</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.19.11</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.19.11'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2025-07-05T00:21:53+00:00</updated>
<entry>
<title>uapi: fix broken link in linux/capability.h</title>
<updated>2025-07-05T00:21:53+00:00</updated>
<author>
<name>Ariel Otilibili</name>
<email>ariel.otilibili-anieli@eurecom.fr</email>
</author>
<published>2025-07-02T10:00:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=cdd73b1666079a73d061396f361df55d59fe96e6'/>
<id>urn:sha1:cdd73b1666079a73d061396f361df55d59fe96e6</id>
<content type='text'>
The link to the libcap library is outdated. Instead, use a link to the
libcap2 library.

As well, give the complete reference of the POSIX compliance.

Signed-off-by: Ariel Otilibili &lt;ariel.otilibili-anieli@eurecom.fr&gt;
Acked-by: Andrew G. Morgan &lt;morgan@kernel.org&gt;
Reviewed-by: Paul Moore &lt;paul@paul-moore.com&gt;
Signed-off-by: Serge Hallyn &lt;sergeh@kernel.org&gt;
</content>
</entry>
<entry>
<title>reboot: add support for configuring emergency hardware protection action</title>
<updated>2025-03-17T06:24:14+00:00</updated>
<author>
<name>Ahmad Fatoum</name>
<email>a.fatoum@pengutronix.de</email>
</author>
<published>2025-02-17T20:39:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e016173f656b89f5f71b7d45dfc599ada8107eef'/>
<id>urn:sha1:e016173f656b89f5f71b7d45dfc599ada8107eef</id>
<content type='text'>
We currently leave the decision of whether to shutdown or reboot to
protect hardware in an emergency situation to the individual drivers.

This works out in some cases, where the driver detecting the critical
failure has inside knowledge: It binds to the system management controller
for example or is guided by hardware description that defines what to do.

In the general case, however, the driver detecting the issue can't know
what the appropriate course of action is and shouldn't be dictating the
policy of dealing with it.

Therefore, add a global hw_protection toggle that allows the user to
specify whether shutdown or reboot should be the default action when the
driver doesn't set policy.

This introduces no functional change yet as hw_protection_trigger() has no
callers, but these will be added in subsequent commits.

[arnd@arndb.de: hide unused hw_protection_attr]
  Link: https://lkml.kernel.org/r/20250224141849.1546019-1-arnd@kernel.org
Link: https://lkml.kernel.org/r/20250217-hw_protection-reboot-v3-7-e1c09b090c0c@pengutronix.de
Signed-off-by: Ahmad Fatoum &lt;a.fatoum@pengutronix.de&gt;
Reviewed-by: Tzung-Bi Shih &lt;tzungbi@kernel.org&gt;
Cc: Benson Leung &lt;bleung@chromium.org&gt;
Cc: Daniel Lezcano &lt;daniel.lezcano@linaro.org&gt;
Cc: Fabio Estevam &lt;festevam@denx.de&gt;
Cc: Guenter Roeck &lt;groeck@chromium.org&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Liam Girdwood &lt;lgirdwood@gmail.com&gt;
Cc: Lukasz Luba &lt;lukasz.luba@arm.com&gt;
Cc: Mark Brown &lt;broonie@kernel.org&gt;
Cc: Matteo Croce &lt;teknoraver@meta.com&gt;
Cc: Matti Vaittinen &lt;mazziesaccount@gmail.com&gt;
Cc: "Rafael J. Wysocki" &lt;rafael@kernel.org&gt;
Cc: Rob Herring (Arm) &lt;robh@kernel.org&gt;
Cc: Rui Zhang &lt;rui.zhang@intel.com&gt;
Cc: Sascha Hauer &lt;kernel@pengutronix.de&gt;
Cc: "Serge E. Hallyn" &lt;serge@hallyn.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>capability: erase checker warnings about struct __user_cap_data_struct</title>
<updated>2023-06-06T21:05:54+00:00</updated>
<author>
<name>GONG, Ruiqi</name>
<email>gongruiqi@huaweicloud.com</email>
</author>
<published>2023-06-02T05:45:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=55382134366e641e97cd83264c22c60c7dc10ccd'/>
<id>urn:sha1:55382134366e641e97cd83264c22c60c7dc10ccd</id>
<content type='text'>
Currently Sparse warns the following when compiling kernel/capability.c:

kernel/capability.c:191:35: warning: incorrect type in argument 2
                              (different address spaces)
kernel/capability.c:191:35:    expected void const *from
kernel/capability.c:191:35:    got struct __user_cap_data_struct
                                 [noderef] __user *
kernel/capability.c:168:14: warning: dereference of noderef expression
...... (multiple noderef warnings on different locations)
kernel/capability.c:244:29: warning: incorrect type in argument 1
                              (different address spaces)
kernel/capability.c:244:29:    expected void *to
kernel/capability.c:244:29:    got struct __user_cap_data_struct
                                 [noderef] __user ( * )[2]
kernel/capability.c:247:42: warning: dereference of noderef expression
...... (multiple noderef warnings on different locations)

It seems that defining `struct __user_cap_data_struct` together with
`cap_user_data_t` make Sparse believe that the struct is `noderef` as
well. Separate their definitions to clarify their respective attributes.

Signed-off-by: GONG, Ruiqi &lt;gongruiqi@huaweicloud.com&gt;
Acked-by: Serge Hallyn &lt;serge@hallyn.com&gt;
[PM: wrapped long lines in the description]
Signed-off-by: Paul Moore &lt;paul@paul-moore.com&gt;
</content>
</entry>
<entry>
<title>capabilities: fix undefined behavior in bit shift for CAP_TO_MASK</title>
<updated>2022-11-05T05:25:57+00:00</updated>
<author>
<name>Gaosheng Cui</name>
<email>cuigaosheng1@huawei.com</email>
</author>
<published>2022-10-31T11:25:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=46653972e3ea64f79e7f8ae3aa41a4d3fdb70a13'/>
<id>urn:sha1:46653972e3ea64f79e7f8ae3aa41a4d3fdb70a13</id>
<content type='text'>
Shifting signed 32-bit value by 31 bits is undefined, so changing
significant bit to unsigned. The UBSAN warning calltrace like below:

UBSAN: shift-out-of-bounds in security/commoncap.c:1252:2
left shift of 1 by 31 places cannot be represented in type 'int'
Call Trace:
 &lt;TASK&gt;
 dump_stack_lvl+0x7d/0xa5
 dump_stack+0x15/0x1b
 ubsan_epilogue+0xe/0x4e
 __ubsan_handle_shift_out_of_bounds+0x1e7/0x20c
 cap_task_prctl+0x561/0x6f0
 security_task_prctl+0x5a/0xb0
 __x64_sys_prctl+0x61/0x8f0
 do_syscall_64+0x58/0x80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
 &lt;/TASK&gt;

Fixes: e338d263a76a ("Add 64-bit capability support to the kernel")
Signed-off-by: Gaosheng Cui &lt;cuigaosheng1@huawei.com&gt;
Acked-by: Andrew G. Morgan &lt;morgan@kernel.org&gt;
Reviewed-by: Serge Hallyn &lt;serge@hallyn.com&gt;
Signed-off-by: Paul Moore &lt;paul@paul-moore.com&gt;
</content>
</entry>
<entry>
<title>exit/bdflush: Remove the deprecated bdflush system call</title>
<updated>2021-07-12T20:17:47+00:00</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2021-06-29T20:11:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b48c7236b13cb5ef1b5fdf744aa8841df0f7b43a'/>
<id>urn:sha1:b48c7236b13cb5ef1b5fdf744aa8841df0f7b43a</id>
<content type='text'>
The bdflush system call has been deprecated for a very long time.
Recently Michael Schmitz tested[1] and found that the last known
caller of of the bdflush system call is unaffected by it's removal.

Since the code is not needed delete it.

[1] https://lkml.kernel.org/r/36123b5d-daa0-6c2b-f2d4-a942f069fd54@gmail.com
Link: https://lkml.kernel.org/r/87sg10quue.fsf_-_@disp2133
Tested-by: Michael Schmitz &lt;schmitzmic@gmail.com&gt;
Acked-by: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Reviewed-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Acked-by: Cyril Hrubis &lt;chrubis@suse.cz&gt;
Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
</content>
</entry>
<entry>
<title>capabilities: require CAP_SETFCAP to map uid 0</title>
<updated>2021-04-20T21:28:33+00:00</updated>
<author>
<name>Serge E. Hallyn</name>
<email>serge@hallyn.com</email>
</author>
<published>2021-04-20T13:43:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=db2e718a47984b9d71ed890eb2ea36ecf150de18'/>
<id>urn:sha1:db2e718a47984b9d71ed890eb2ea36ecf150de18</id>
<content type='text'>
cap_setfcap is required to create file capabilities.

Since commit 8db6c34f1dbc ("Introduce v3 namespaced file capabilities"),
a process running as uid 0 but without cap_setfcap is able to work
around this as follows: unshare a new user namespace which maps parent
uid 0 into the child namespace.

While this task will not have new capabilities against the parent
namespace, there is a loophole due to the way namespaced file
capabilities are represented as xattrs.  File capabilities valid in
userns 1 are distinguished from file capabilities valid in userns 2 by
the kuid which underlies uid 0.  Therefore the restricted root process
can unshare a new self-mapping namespace, add a namespaced file
capability onto a file, then use that file capability in the parent
namespace.

To prevent that, do not allow mapping parent uid 0 if the process which
opened the uid_map file does not have CAP_SETFCAP, which is the
capability for setting file capabilities.

As a further wrinkle: a task can unshare its user namespace, then open
its uid_map file itself, and map (only) its own uid.  In this case we do
not have the credential from before unshare, which was potentially more
restricted.  So, when creating a user namespace, we record whether the
creator had CAP_SETFCAP.  Then we can use that during map_write().

With this patch:

1. Unprivileged user can still unshare -Ur

   ubuntu@caps:~$ unshare -Ur
   root@caps:~# logout

2. Root user can still unshare -Ur

   ubuntu@caps:~$ sudo bash
   root@caps:/home/ubuntu# unshare -Ur
   root@caps:/home/ubuntu# logout

3. Root user without CAP_SETFCAP cannot unshare -Ur:

   root@caps:/home/ubuntu# /sbin/capsh --drop=cap_setfcap --
   root@caps:/home/ubuntu# /sbin/setcap cap_setfcap=p /sbin/setcap
   unable to set CAP_SETFCAP effective capability: Operation not permitted
   root@caps:/home/ubuntu# unshare -Ur
   unshare: write failed /proc/self/uid_map: Operation not permitted

Note: an alternative solution would be to allow uid 0 mappings by
processes without CAP_SETFCAP, but to prevent such a namespace from
writing any file capabilities.  This approach can be seen at [1].

Background history: commit 95ebabde382 ("capabilities: Don't allow
writing ambiguous v3 file capabilities") tried to fix the issue by
preventing v3 fscaps to be written to disk when the root uid would map
to the same uid in nested user namespaces.  This led to regressions for
various workloads.  For example, see [2].  Ultimately this is a valid
use-case we have to support meaning we had to revert this change in
3b0c2d3eaa83 ("Revert 95ebabde382c ("capabilities: Don't allow writing
ambiguous v3 file capabilities")").

Link: https://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux.git/log/?h=2021-04-15/setfcap-nsfscaps-v4 [1]
Link: https://github.com/containers/buildah/issues/3071 [2]
Signed-off-by: Serge Hallyn &lt;serge@hallyn.com&gt;
Reviewed-by: Andrew G. Morgan &lt;morgan@kernel.org&gt;
Tested-by: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
Reviewed-by: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
Tested-by: Giuseppe Scrivano &lt;gscrivan@redhat.com&gt;
Cc: Eric Biederman &lt;ebiederm@xmission.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>block: grant IOPRIO_CLASS_RT to CAP_SYS_NICE</title>
<updated>2020-09-02T01:38:33+00:00</updated>
<author>
<name>Khazhismel Kumykov</name>
<email>khazhy@google.com</email>
</author>
<published>2020-08-24T22:10:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=9d3a39a5f1e45827b008fff1ee9cf3cac3409665'/>
<id>urn:sha1:9d3a39a5f1e45827b008fff1ee9cf3cac3409665</id>
<content type='text'>
CAP_SYS_ADMIN is too broad, and ionice fits into CAP_SYS_NICE's grouping.

Retain CAP_SYS_ADMIN permission for backwards compatibility.

Signed-off-by: Khazhismel Kumykov &lt;khazhy@google.com&gt;
Reviewed-by: Bart Van Assche &lt;bvanassche@acm.org&gt;
Acked-by: Serge Hallyn &lt;serge@hallyn.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>capabilities: Introduce CAP_CHECKPOINT_RESTORE</title>
<updated>2020-07-19T18:14:42+00:00</updated>
<author>
<name>Adrian Reber</name>
<email>areber@redhat.com</email>
</author>
<published>2020-07-19T10:04:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=124ea650d3072b005457faed69909221c2905a1f'/>
<id>urn:sha1:124ea650d3072b005457faed69909221c2905a1f</id>
<content type='text'>
This patch introduces CAP_CHECKPOINT_RESTORE, a new capability facilitating
checkpoint/restore for non-root users.

Over the last years, The CRIU (Checkpoint/Restore In Userspace) team has
been asked numerous times if it is possible to checkpoint/restore a
process as non-root. The answer usually was: 'almost'.

The main blocker to restore a process as non-root was to control the PID
of the restored process. This feature available via the clone3 system
call, or via /proc/sys/kernel/ns_last_pid is unfortunately guarded by
CAP_SYS_ADMIN.

In the past two years, requests for non-root checkpoint/restore have
increased due to the following use cases:
* Checkpoint/Restore in an HPC environment in combination with a
  resource manager distributing jobs where users are always running as
  non-root. There is a desire to provide a way to checkpoint and
  restore long running jobs.
* Container migration as non-root
* We have been in contact with JVM developers who are integrating
  CRIU into a Java VM to decrease the startup time. These
  checkpoint/restore applications are not meant to be running with
  CAP_SYS_ADMIN.

We have seen the following workarounds:
* Use a setuid wrapper around CRIU:
  See https://github.com/FredHutch/slurm-examples/blob/master/checkpointer/lib/checkpointer/checkpointer-suid.c
* Use a setuid helper that writes to ns_last_pid.
  Unfortunately, this helper delegation technique is impossible to use
  with clone3, and is thus prone to races.
  See https://github.com/twosigma/set_ns_last_pid
* Cycle through PIDs with fork() until the desired PID is reached:
  This has been demonstrated to work with cycling rates of 100,000 PIDs/s
  See https://github.com/twosigma/set_ns_last_pid
* Patch out the CAP_SYS_ADMIN check from the kernel
* Run the desired application in a new user and PID namespace to provide
  a local CAP_SYS_ADMIN for controlling PIDs. This technique has limited
  use in typical container environments (e.g., Kubernetes) as /proc is
  typically protected with read-only layers (e.g., /proc/sys) for
  hardening purposes. Read-only layers prevent additional /proc mounts
  (due to proc's SB_I_USERNS_VISIBLE property), making the use of new
  PID namespaces limited as certain applications need access to /proc
  matching their PID namespace.

The introduced capability allows to:
* Control PIDs when the current user is CAP_CHECKPOINT_RESTORE capable
  for the corresponding PID namespace via ns_last_pid/clone3.
* Open files in /proc/pid/map_files when the current user is
  CAP_CHECKPOINT_RESTORE capable in the root namespace, useful for
  recovering files that are unreachable via the file system such as
  deleted files, or memfd files.

See corresponding selftest for an example with clone3().

Signed-off-by: Adrian Reber &lt;areber@redhat.com&gt;
Signed-off-by: Nicolas Viennot &lt;Nicolas.Viennot@twosigma.com&gt;
Reviewed-by: Serge Hallyn &lt;serge@hallyn.com&gt;
Acked-by: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
Link: https://lore.kernel.org/r/20200719100418.2112740-2-areber@redhat.com
Signed-off-by: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next</title>
<updated>2020-06-03T23:27:18+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2020-06-03T23:27:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=cb8e59cc87201af93dfbb6c3dccc8fcad72a09c2'/>
<id>urn:sha1:cb8e59cc87201af93dfbb6c3dccc8fcad72a09c2</id>
<content type='text'>
Pull networking updates from David Miller:

 1) Allow setting bluetooth L2CAP modes via socket option, from Luiz
    Augusto von Dentz.

 2) Add GSO partial support to igc, from Sasha Neftin.

 3) Several cleanups and improvements to r8169 from Heiner Kallweit.

 4) Add IF_OPER_TESTING link state and use it when ethtool triggers a
    device self-test. From Andrew Lunn.

 5) Start moving away from custom driver versions, use the globally
    defined kernel version instead, from Leon Romanovsky.

 6) Support GRO vis gro_cells in DSA layer, from Alexander Lobakin.

 7) Allow hard IRQ deferral during NAPI, from Eric Dumazet.

 8) Add sriov and vf support to hinic, from Luo bin.

 9) Support Media Redundancy Protocol (MRP) in the bridging code, from
    Horatiu Vultur.

10) Support netmap in the nft_nat code, from Pablo Neira Ayuso.

11) Allow UDPv6 encapsulation of ESP in the ipsec code, from Sabrina
    Dubroca. Also add ipv6 support for espintcp.

12) Lots of ReST conversions of the networking documentation, from Mauro
    Carvalho Chehab.

13) Support configuration of ethtool rxnfc flows in bcmgenet driver,
    from Doug Berger.

14) Allow to dump cgroup id and filter by it in inet_diag code, from
    Dmitry Yakunin.

15) Add infrastructure to export netlink attribute policies to
    userspace, from Johannes Berg.

16) Several optimizations to sch_fq scheduler, from Eric Dumazet.

17) Fallback to the default qdisc if qdisc init fails because otherwise
    a packet scheduler init failure will make a device inoperative. From
    Jesper Dangaard Brouer.

18) Several RISCV bpf jit optimizations, from Luke Nelson.

19) Correct the return type of the -&gt;ndo_start_xmit() method in several
    drivers, it's netdev_tx_t but many drivers were using
    'int'. From Yunjian Wang.

20) Add an ethtool interface for PHY master/slave config, from Oleksij
    Rempel.

21) Add BPF iterators, from Yonghang Song.

22) Add cable test infrastructure, including ethool interfaces, from
    Andrew Lunn. Marvell PHY driver is the first to support this
    facility.

23) Remove zero-length arrays all over, from Gustavo A. R. Silva.

24) Calculate and maintain an explicit frame size in XDP, from Jesper
    Dangaard Brouer.

25) Add CAP_BPF, from Alexei Starovoitov.

26) Support terse dumps in the packet scheduler, from Vlad Buslov.

27) Support XDP_TX bulking in dpaa2 driver, from Ioana Ciornei.

28) Add devm_register_netdev(), from Bartosz Golaszewski.

29) Minimize qdisc resets, from Cong Wang.

30) Get rid of kernel_getsockopt and kernel_setsockopt in order to
    eliminate set_fs/get_fs calls. From Christoph Hellwig.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2517 commits)
  selftests: net: ip_defrag: ignore EPERM
  net_failover: fixed rollback in net_failover_open()
  Revert "tipc: Fix potential tipc_aead refcnt leak in tipc_crypto_rcv"
  Revert "tipc: Fix potential tipc_node refcnt leak in tipc_rcv"
  vmxnet3: allow rx flow hash ops only when rss is enabled
  hinic: add set_channels ethtool_ops support
  selftests/bpf: Add a default $(CXX) value
  tools/bpf: Don't use $(COMPILE.c)
  bpf, selftests: Use bpf_probe_read_kernel
  s390/bpf: Use bcr 0,%0 as tail call nop filler
  s390/bpf: Maintain 8-byte stack alignment
  selftests/bpf: Fix verifier test
  selftests/bpf: Fix sample_cnt shared between two threads
  bpf, selftests: Adapt cls_redirect to call csum_level helper
  bpf: Add csum_level helper for fixing up csum levels
  bpf: Fix up bpf_skb_adjust_room helper's skb csum setting
  sfc: add missing annotation for efx_ef10_try_update_nic_stats_vf()
  crypto/chtls: IPv6 support for inline TLS
  Crypto/chcr: Fixes a coccinile check error
  Crypto/chcr: Fixes compilations warnings
  ...
</content>
</entry>
<entry>
<title>Merge branch 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security</title>
<updated>2020-06-03T00:36:24+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2020-06-03T00:36:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d9afbb3509900a953f5cf90bc57e793ee80c1108'/>
<id>urn:sha1:d9afbb3509900a953f5cf90bc57e793ee80c1108</id>
<content type='text'>
Pull lockdown update from James Morris:
 "An update for the security subsystem to allow unprivileged users
  to see the status of the lockdown feature. From Jeremy Cline"

Also an added comment to describe CAP_SETFCAP.

* 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  capabilities: add description for CAP_SETFCAP
  lockdown: Allow unprivileged users to see lockdown status
</content>
</entry>
</feed>
