diff options
Diffstat (limited to 'Documentation/userspace-api')
45 files changed, 2505 insertions, 400 deletions
diff --git a/Documentation/userspace-api/accelerators/ocxl.rst b/Documentation/userspace-api/accelerators/ocxl.rst index db7570d5e50d..4e213af70237 100644 --- a/Documentation/userspace-api/accelerators/ocxl.rst +++ b/Documentation/userspace-api/accelerators/ocxl.rst @@ -3,8 +3,11 @@ OpenCAPI (Open Coherent Accelerator Processor Interface) ======================================================== OpenCAPI is an interface between processors and accelerators. It aims -at being low-latency and high-bandwidth. The specification is -developed by the `OpenCAPI Consortium <http://opencapi.org/>`_. +at being low-latency and high-bandwidth. + +The specification was developed by the OpenCAPI Consortium, and is now +available from the `Compute Express Link Consortium +<https://computeexpresslink.org/resource/opencapi-specification-archive/>`_. It allows an accelerator (which could be an FPGA, ASICs, ...) to access the host memory coherently, using virtual addresses. An OpenCAPI diff --git a/Documentation/userspace-api/check_exec.rst b/Documentation/userspace-api/check_exec.rst new file mode 100644 index 000000000000..05dfe3b56f71 --- /dev/null +++ b/Documentation/userspace-api/check_exec.rst @@ -0,0 +1,144 @@ +.. SPDX-License-Identifier: GPL-2.0 +.. Copyright © 2024 Microsoft Corporation + +=================== +Executability check +=================== + +The ``AT_EXECVE_CHECK`` :manpage:`execveat(2)` flag, and the +``SECBIT_EXEC_RESTRICT_FILE`` and ``SECBIT_EXEC_DENY_INTERACTIVE`` securebits +are intended for script interpreters and dynamic linkers to enforce a +consistent execution security policy handled by the kernel. See the +`samples/check-exec/inc.c`_ example. + +Whether an interpreter should check these securebits or not depends on the +security risk of running malicious scripts with respect to the execution +environment, and whether the kernel can check if a script is trustworthy or +not. For instance, Python scripts running on a server can use arbitrary +syscalls and access arbitrary files. Such interpreters should then be +enlighten to use these securebits and let users define their security policy. +However, a JavaScript engine running in a web browser should already be +sandboxed and then should not be able to harm the user's environment. + +Script interpreters or dynamic linkers built for tailored execution environments +(e.g. hardened Linux distributions or hermetic container images) could use +``AT_EXECVE_CHECK`` without checking the related securebits if backward +compatibility is handled by something else (e.g. atomic update ensuring that +all legitimate libraries are allowed to be executed). It is then recommended +for script interpreters and dynamic linkers to check the securebits at run time +by default, but also to provide the ability for custom builds to behave like if +``SECBIT_EXEC_RESTRICT_FILE`` or ``SECBIT_EXEC_DENY_INTERACTIVE`` were always +set to 1 (i.e. always enforce restrictions). + +AT_EXECVE_CHECK +=============== + +Passing the ``AT_EXECVE_CHECK`` flag to :manpage:`execveat(2)` only performs a +check on a regular file and returns 0 if execution of this file would be +allowed, ignoring the file format and then the related interpreter dependencies +(e.g. ELF libraries, script's shebang). + +Programs should always perform this check to apply kernel-level checks against +files that are not directly executed by the kernel but passed to a user space +interpreter instead. All files that contain executable code, from the point of +view of the interpreter, should be checked. However the result of this check +should only be enforced according to ``SECBIT_EXEC_RESTRICT_FILE`` or +``SECBIT_EXEC_DENY_INTERACTIVE.``. + +The main purpose of this flag is to improve the security and consistency of an +execution environment to ensure that direct file execution (e.g. +``./script.sh``) and indirect file execution (e.g. ``sh script.sh``) lead to +the same result. For instance, this can be used to check if a file is +trustworthy according to the caller's environment. + +In a secure environment, libraries and any executable dependencies should also +be checked. For instance, dynamic linking should make sure that all libraries +are allowed for execution to avoid trivial bypass (e.g. using ``LD_PRELOAD``). +For such secure execution environment to make sense, only trusted code should +be executable, which also requires integrity guarantees. + +To avoid race conditions leading to time-of-check to time-of-use issues, +``AT_EXECVE_CHECK`` should be used with ``AT_EMPTY_PATH`` to check against a +file descriptor instead of a path. + +SECBIT_EXEC_RESTRICT_FILE and SECBIT_EXEC_DENY_INTERACTIVE +========================================================== + +When ``SECBIT_EXEC_RESTRICT_FILE`` is set, a process should only interpret or +execute a file if a call to :manpage:`execveat(2)` with the related file +descriptor and the ``AT_EXECVE_CHECK`` flag succeed. + +This secure bit may be set by user session managers, service managers, +container runtimes, sandboxer tools... Except for test environments, the +related ``SECBIT_EXEC_RESTRICT_FILE_LOCKED`` bit should also be set. + +Programs should only enforce consistent restrictions according to the +securebits but without relying on any other user-controlled configuration. +Indeed, the use case for these securebits is to only trust executable code +vetted by the system configuration (through the kernel), so we should be +careful to not let untrusted users control this configuration. + +However, script interpreters may still use user configuration such as +environment variables as long as it is not a way to disable the securebits +checks. For instance, the ``PATH`` and ``LD_PRELOAD`` variables can be set by +a script's caller. Changing these variables may lead to unintended code +executions, but only from vetted executable programs, which is OK. For this to +make sense, the system should provide a consistent security policy to avoid +arbitrary code execution e.g., by enforcing a write xor execute policy. + +When ``SECBIT_EXEC_DENY_INTERACTIVE`` is set, a process should never interpret +interactive user commands (e.g. scripts). However, if such commands are passed +through a file descriptor (e.g. stdin), its content should be interpreted if a +call to :manpage:`execveat(2)` with the related file descriptor and the +``AT_EXECVE_CHECK`` flag succeed. + +For instance, script interpreters called with a script snippet as argument +should always deny such execution if ``SECBIT_EXEC_DENY_INTERACTIVE`` is set. + +This secure bit may be set by user session managers, service managers, +container runtimes, sandboxer tools... Except for test environments, the +related ``SECBIT_EXEC_DENY_INTERACTIVE_LOCKED`` bit should also be set. + +Here is the expected behavior for a script interpreter according to combination +of any exec securebits: + +1. ``SECBIT_EXEC_RESTRICT_FILE=0`` and ``SECBIT_EXEC_DENY_INTERACTIVE=0`` + + Always interpret scripts, and allow arbitrary user commands (default). + + No threat, everyone and everything is trusted, but we can get ahead of + potential issues thanks to the call to :manpage:`execveat(2)` with + ``AT_EXECVE_CHECK`` which should always be performed but ignored by the + script interpreter. Indeed, this check is still important to enable systems + administrators to verify requests (e.g. with audit) and prepare for + migration to a secure mode. + +2. ``SECBIT_EXEC_RESTRICT_FILE=1`` and ``SECBIT_EXEC_DENY_INTERACTIVE=0`` + + Deny script interpretation if they are not executable, but allow + arbitrary user commands. + + The threat is (potential) malicious scripts run by trusted (and not fooled) + users. That can protect against unintended script executions (e.g. ``sh + /tmp/*.sh``). This makes sense for (semi-restricted) user sessions. + +3. ``SECBIT_EXEC_RESTRICT_FILE=0`` and ``SECBIT_EXEC_DENY_INTERACTIVE=1`` + + Always interpret scripts, but deny arbitrary user commands. + + This use case may be useful for secure services (i.e. without interactive + user session) where scripts' integrity is verified (e.g. with IMA/EVM or + dm-verity/IPE) but where access rights might not be ready yet. Indeed, + arbitrary interactive commands would be much more difficult to check. + +4. ``SECBIT_EXEC_RESTRICT_FILE=1`` and ``SECBIT_EXEC_DENY_INTERACTIVE=1`` + + Deny script interpretation if they are not executable, and also deny + any arbitrary user commands. + + The threat is malicious scripts run by untrusted users (but trusted code). + This makes sense for system services that may only execute trusted scripts. + +.. Links +.. _samples/check-exec/inc.c: + https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/samples/check-exec/inc.c diff --git a/Documentation/userspace-api/dma-buf-heaps.rst b/Documentation/userspace-api/dma-buf-heaps.rst new file mode 100644 index 000000000000..1dfe5e7acd5a --- /dev/null +++ b/Documentation/userspace-api/dma-buf-heaps.rst @@ -0,0 +1,28 @@ +.. SPDX-License-Identifier: GPL-2.0 + +============================== +Allocating dma-buf using heaps +============================== + +Dma-buf Heaps are a way for userspace to allocate dma-buf objects. They are +typically used to allocate buffers from a specific allocation pool, or to share +buffers across frameworks. + +Heaps +===== + +A heap represents a specific allocator. The Linux kernel currently supports the +following heaps: + + - The ``system`` heap allocates virtually contiguous, cacheable, buffers. + + - The ``cma`` heap allocates physically contiguous, cacheable, + buffers. Only present if a CMA region is present. Such a region is + usually created either through the kernel commandline through the + ``cma`` parameter, a memory region Device-Tree node with the + ``linux,cma-default`` property set, or through the ``CMA_SIZE_MBYTES`` or + ``CMA_SIZE_PERCENTAGE`` Kconfig options. The heap's name in devtmpfs is + ``default_cma_region``. For backwards compatibility, when the + ``DMABUF_HEAPS_CMA_LEGACY`` Kconfig option is set, a duplicate node is + created following legacy naming conventions; the legacy name might be + ``reserved``, ``linux,cma``, or ``default-pool``. diff --git a/Documentation/userspace-api/fwctl/fwctl-cxl.rst b/Documentation/userspace-api/fwctl/fwctl-cxl.rst new file mode 100644 index 000000000000..670b43b72949 --- /dev/null +++ b/Documentation/userspace-api/fwctl/fwctl-cxl.rst @@ -0,0 +1,142 @@ +.. SPDX-License-Identifier: GPL-2.0 + +================ +fwctl cxl driver +================ + +:Author: Dave Jiang + +Overview +======== + +The CXL spec defines a set of commands that can be issued to the mailbox of a +CXL device or switch. It also left room for vendor specific commands to be +issued to the mailbox as well. fwctl provides a path to issue a set of allowed +mailbox commands from user space to the device moderated by the kernel driver. + +The following 3 commands will be used to support CXL Features: +CXL spec r3.1 8.2.9.6.1 Get Supported Features (Opcode 0500h) +CXL spec r3.1 8.2.9.6.2 Get Feature (Opcode 0501h) +CXL spec r3.1 8.2.9.6.3 Set Feature (Opcode 0502h) + +The "Get Supported Features" return data may be filtered by the kernel driver to +drop any features that are forbidden by the kernel or being exclusively used by +the kernel. The driver will set the "Set Feature Size" of the "Get Supported +Features Supported Feature Entry" to 0 to indicate that the Feature cannot be +modified. The "Get Supported Features" command and the "Get Features" falls +under the fwctl policy of FWCTL_RPC_CONFIGURATION. + +For "Set Feature" command, the access policy currently is broken down into two +categories depending on the Set Feature effects reported by the device. If the +Set Feature will cause immediate change to the device, the fwctl access policy +must be FWCTL_RPC_DEBUG_WRITE_FULL. The effects for this level are +"immediate config change", "immediate data change", "immediate policy change", +or "immediate log change" for the set effects mask. If the effects are "config +change with cold reset" or "config change with conventional reset", then the +fwctl access policy must be FWCTL_RPC_DEBUG_WRITE or higher. + +fwctl cxl User API +================== + +.. kernel-doc:: include/uapi/fwctl/cxl.h + +1. Driver info query +-------------------- + +First step for the app is to issue the ioctl(FWCTL_CMD_INFO). Successful +invocation of the ioctl implies the Features capability is operational and +returns an all zeros 32bit payload. A ``struct fwctl_info`` needs to be filled +out with the ``fwctl_info.out_device_type`` set to ``FWCTL_DEVICE_TYPE_CXL``. +The return data should be ``struct fwctl_info_cxl`` that contains a reserved +32bit field that should be all zeros. + +2. Send hardware commands +------------------------- + +Next step is to send the 'Get Supported Features' command to the driver from +user space via ioctl(FWCTL_RPC). A ``struct fwctl_rpc_cxl`` is pointed to +by ``fwctl_rpc.in``. ``struct fwctl_rpc_cxl.in_payload`` points to +the hardware input structure that is defined by the CXL spec. ``fwctl_rpc.out`` +points to the buffer that contains a ``struct fwctl_rpc_cxl_out`` that includes +the hardware output data inlined as ``fwctl_rpc_cxl_out.payload``. This command +is called twice. First time to retrieve the number of features supported. +A second time to retrieve the specific feature details as the output data. + +After getting the specific feature details, a Get/Set Feature command can be +appropriately programmed and sent. For a "Set Feature" command, the retrieved +feature info contains an effects field that details the resulting +"Set Feature" command will trigger. That will inform the user whether +the system is configured to allowed the "Set Feature" command or not. + +Code example of a Get Feature +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +.. code-block:: c + + static int cxl_fwctl_rpc_get_test_feature(int fd, struct test_feature *feat_ctx, + const uint32_t expected_data) + { + struct cxl_mbox_get_feat_in *feat_in; + struct fwctl_rpc_cxl_out *out; + struct fwctl_rpc rpc = {0}; + struct fwctl_rpc_cxl *in; + size_t out_size, in_size; + uint32_t val; + void *data; + int rc; + + in_size = sizeof(*in) + sizeof(*feat_in); + rc = posix_memalign((void **)&in, 16, in_size); + if (rc) + return -ENOMEM; + memset(in, 0, in_size); + feat_in = &in->get_feat_in; + + uuid_copy(feat_in->uuid, feat_ctx->uuid); + feat_in->count = feat_ctx->get_size; + + out_size = sizeof(*out) + feat_ctx->get_size; + rc = posix_memalign((void **)&out, 16, out_size); + if (rc) + goto free_in; + memset(out, 0, out_size); + + in->opcode = CXL_MBOX_OPCODE_GET_FEATURE; + in->op_size = sizeof(*feat_in); + + rpc.size = sizeof(rpc); + rpc.scope = FWCTL_RPC_CONFIGURATION; + rpc.in_len = in_size; + rpc.out_len = out_size; + rpc.in = (uint64_t)(uint64_t *)in; + rpc.out = (uint64_t)(uint64_t *)out; + + rc = send_command(fd, &rpc, out); + if (rc) + goto free_all; + + data = out->payload; + val = le32toh(*(__le32 *)data); + if (memcmp(&val, &expected_data, sizeof(val)) != 0) { + rc = -ENXIO; + goto free_all; + } + + free_all: + free(out); + free_in: + free(in); + return rc; + } + +Take a look at CXL CLI test directory +<https://github.com/pmem/ndctl/tree/main/test/fwctl.c> for a detailed user code +for examples on how to exercise this path. + + +fwctl cxl Kernel API +==================== + +.. kernel-doc:: drivers/cxl/core/features.c + :export: +.. kernel-doc:: include/cxl/features.h diff --git a/Documentation/userspace-api/fwctl/fwctl.rst b/Documentation/userspace-api/fwctl/fwctl.rst new file mode 100644 index 000000000000..a74eab8d14c6 --- /dev/null +++ b/Documentation/userspace-api/fwctl/fwctl.rst @@ -0,0 +1,286 @@ +.. SPDX-License-Identifier: GPL-2.0 + +=============== +fwctl subsystem +=============== + +:Author: Jason Gunthorpe + +Overview +======== + +Modern devices contain extensive amounts of FW, and in many cases, are largely +software-defined pieces of hardware. The evolution of this approach is largely a +reaction to Moore's Law where a chip tape out is now highly expensive, and the +chip design is extremely large. Replacing fixed HW logic with a flexible and +tightly coupled FW/HW combination is an effective risk mitigation against chip +respin. Problems in the HW design can be counteracted in device FW. This is +especially true for devices which present a stable and backwards compatible +interface to the operating system driver (such as NVMe). + +The FW layer in devices has grown to incredible size and devices frequently +integrate clusters of fast processors to run it. For example, mlx5 devices have +over 30MB of FW code, and big configurations operate with over 1GB of FW managed +runtime state. + +The availability of such a flexible layer has created quite a variety in the +industry where single pieces of silicon are now configurable software-defined +devices and can operate in substantially different ways depending on the need. +Further, we often see cases where specific sites wish to operate devices in ways +that are highly specialized and require applications that have been tailored to +their unique configuration. + +Further, devices have become multi-functional and integrated to the point they +no longer fit neatly into the kernel's division of subsystems. Modern +multi-functional devices have drivers, such as bnxt/ice/mlx5/pds, that span many +subsystems while sharing the underlying hardware using the auxiliary device +system. + +All together this creates a challenge for the operating system, where devices +have an expansive FW environment that needs robust device-specific debugging +support, and FW-driven functionality that is not well suited to “generic” +interfaces. fwctl seeks to allow access to the full device functionality from +user space in the areas of debuggability, management, and first-boot/nth-boot +provisioning. + +fwctl is aimed at the common device design pattern where the OS and FW +communicate via an RPC message layer constructed with a queue or mailbox scheme. +In this case the driver will typically have some layer to deliver RPC messages +and collect RPC responses from device FW. The in-kernel subsystem drivers that +operate the device for its primary purposes will use these RPCs to build their +drivers, but devices also usually have a set of ancillary RPCs that don't really +fit into any specific subsystem. For example, a HW RAID controller is primarily +operated by the block layer but also comes with a set of RPCs to administer the +construction of drives within the HW RAID. + +In the past when devices were more single function, individual subsystems would +grow different approaches to solving some of these common problems. For instance, +monitoring device health, manipulating its FLASH, debugging the FW, +provisioning, all have various unique interfaces across the kernel. + +fwctl's purpose is to define a common set of limited rules, described below, +that allow user space to securely construct and execute RPCs inside device FW. +The rules serve as an agreement between the operating system and FW on how to +correctly design the RPC interface. As a uAPI the subsystem provides a thin +layer of discovery and a generic uAPI to deliver the RPCs and collect the +response. It supports a system of user space libraries and tools which will +use this interface to control the device using the device native protocols. + +Scope of Action +--------------- + +fwctl drivers are strictly restricted to being a way to operate the device FW. +It is not an avenue to access random kernel internals, or other operating system +SW states. + +fwctl instances must operate on a well-defined device function, and the device +should have a well-defined security model for what scope within the physical +device the function is permitted to access. For instance, the most complex PCIe +device today may broadly have several function-level scopes: + + 1. A privileged function with full access to the on-device global state and + configuration + + 2. Multiple hypervisor functions with control over itself and child functions + used with VMs + + 3. Multiple VM functions tightly scoped within the VM + +The device may create a logical parent/child relationship between these scopes. +For instance, a child VM's FW may be within the scope of the hypervisor FW. It is +quite common in the VFIO world that the hypervisor environment has a complex +provisioning/profiling/configuration responsibility for the function VFIO +assigns to the VM. + +Further, within the function, devices often have RPC commands that fall within +some general scopes of action (see enum fwctl_rpc_scope): + + 1. Access to function & child configuration, FLASH, etc. that becomes live at a + function reset. Access to function & child runtime configuration that is + transparent or non-disruptive to any driver or VM. + + 2. Read-only access to function debug information that may report on FW objects + in the function & child, including FW objects owned by other kernel + subsystems. + + 3. Write access to function & child debug information strictly compatible with + the principles of kernel lockdown and kernel integrity protection. Triggers + a kernel taint. + + 4. Full debug device access. Triggers a kernel taint, requires CAP_SYS_RAWIO. + +User space will provide a scope label on each RPC and the kernel must enforce the +above CAPs and taints based on that scope. A combination of kernel and FW can +enforce that RPCs are placed in the correct scope by user space. + +Disallowed behavior +------------------- + +There are many things this interface must not allow user space to do (without a +taint or CAP), broadly derived from the principles of kernel lockdown. Some +examples: + + 1. DMA to/from arbitrary memory, hang the system, compromise FW integrity with + untrusted code, or otherwise compromise device or system security and + integrity. + + 2. Provide an abnormal “back door” to kernel drivers. No manipulation of kernel + objects owned by kernel drivers. + + 3. Directly configure or otherwise control kernel drivers. A subsystem kernel + driver can react to the device configuration at function reset/driver load + time, but otherwise must not be coupled to fwctl. + + 4. Operate the HW in a way that overlaps with the core purpose of another + primary kernel subsystem, such as read/write to LBAs, send/receive of + network packets, or operate an accelerator's data plane. + +fwctl is not a replacement for device direct access subsystems like uacce or +VFIO. + +Operations exposed through fwctl's non-tainting interfaces should be fully +sharable with other users of the device. For instance, exposing a RPC through +fwctl should never prevent a kernel subsystem from also concurrently using that +same RPC or hardware unit down the road. In such cases fwctl will be less +important than proper kernel subsystems that eventually emerge. Mistakes in this +area resulting in clashes will be resolved in favour of a kernel implementation. + +fwctl User API +============== + +.. kernel-doc:: include/uapi/fwctl/fwctl.h +.. kernel-doc:: include/uapi/fwctl/mlx5.h +.. kernel-doc:: include/uapi/fwctl/pds.h + +sysfs Class +----------- + +fwctl has a sysfs class (/sys/class/fwctl/fwctlNN/) and character devices +(/dev/fwctl/fwctlNN) with a simple numbered scheme. The character device +operates the iotcl uAPI described above. + +fwctl devices can be related to driver components in other subsystems through +sysfs:: + + $ ls /sys/class/fwctl/fwctl0/device/infiniband/ + ibp0s10f0 + + $ ls /sys/class/infiniband/ibp0s10f0/device/fwctl/ + fwctl0/ + + $ ls /sys/devices/pci0000:00/0000:00:0a.0/fwctl/fwctl0 + dev device power subsystem uevent + +User space Community +-------------------- + +Drawing inspiration from nvme-cli, participating in the kernel side must come +with a user space in a common TBD git tree, at a minimum to usefully operate the +kernel driver. Providing such an implementation is a pre-condition to merging a +kernel driver. + +The goal is to build user space community around some of the shared problems +we all have, and ideally develop some common user space programs with some +starting themes of: + + - Device in-field debugging + + - HW provisioning + + - VFIO child device profiling before VM boot + + - Confidential Compute topics (attestation, secure provisioning) + +that stretch across all subsystems in the kernel. fwupd is a great example of +how an excellent user space experience can emerge out of kernel-side diversity. + +fwctl Kernel API +================ + +.. kernel-doc:: drivers/fwctl/main.c + :export: +.. kernel-doc:: include/linux/fwctl.h + +fwctl Driver design +------------------- + +In many cases a fwctl driver is going to be part of a larger cross-subsystem +device possibly using the auxiliary_device mechanism. In that case several +subsystems are going to be sharing the same device and FW interface layer so the +device design must already provide for isolation and cooperation between kernel +subsystems. fwctl should fit into that same model. + +Part of the driver should include a description of how its scope restrictions +and security model work. The driver and FW together must ensure that RPCs +provided by user space are mapped to the appropriate scope. If the validation is +done in the driver then the validation can read a 'command effects' report from +the device, or hardwire the enforcement. If the validation is done in the FW, +then the driver should pass the fwctl_rpc_scope to the FW along with the command. + +The driver and FW must cooperate to ensure that either fwctl cannot allocate +any FW resources, or any resources it does allocate are freed on FD closure. A +driver primarily constructed around FW RPCs may find that its core PCI function +and RPC layer belongs under fwctl with auxiliary devices connecting to other +subsystems. + +Each device type must be mindful of Linux's philosophy for stable ABI. The FW +RPC interface does not have to meet a strictly stable ABI, but it does need to +meet an expectation that user space tools that are deployed and in significant +use don't needlessly break. FW upgrade and kernel upgrade should keep widely +deployed tooling working. + +Development and debugging focused RPCs under more permissive scopes can have +less stability if the tools using them are only run under exceptional +circumstances and not for every day use of the device. Debugging tools may even +require exact version matching as they may require something similar to DWARF +debug information from the FW binary. + +Security Response +================= + +The kernel remains the gatekeeper for this interface. If violations of the +scopes, security or isolation principles are found, we have options to let +devices fix them with a FW update, push a kernel patch to parse and block RPC +commands or push a kernel patch to block entire firmware versions/devices. + +While the kernel can always directly parse and restrict RPCs, it is expected +that the existing kernel pattern of allowing drivers to delegate validation to +FW to be a useful design. + +Existing Similar Examples +========================= + +The approach described in this document is not a new idea. Direct, or near +direct device access has been offered by the kernel in different areas for +decades. With more devices wanting to follow this design pattern it is becoming +clear that it is not entirely well understood and, more importantly, the +security considerations are not well defined or agreed upon. + +Some examples: + + - HW RAID controllers. This includes RPCs to do things like compose drives into + a RAID volume, configure RAID parameters, monitor the HW and more. + + - Baseboard managers. RPCs for configuring settings in the device and more. + + - NVMe vendor command capsules. nvme-cli provides access to some monitoring + functions that different products have defined, but more exist. + + - CXL also has a NVMe-like vendor command system. + + - DRM allows user space drivers to send commands to the device via kernel + mediation. + + - RDMA allows user space drivers to directly push commands to the device + without kernel involvement. + + - Various “raw” APIs, raw HID (SDL2), raw USB, NVMe Generic Interface, etc. + +The first 4 are examples of areas that fwctl intends to cover. The latter three +are examples of disallowed behavior as they fully overlap with the primary purpose +of a kernel subsystem. + +Some key lessons learned from these past efforts are the importance of having a +common user space project to use as a pre-condition for obtaining a kernel +driver. Developing good community around useful software in user space is key to +getting companies to fund participation to enable their products. diff --git a/Documentation/userspace-api/fwctl/index.rst b/Documentation/userspace-api/fwctl/index.rst new file mode 100644 index 000000000000..316ac456ad3b --- /dev/null +++ b/Documentation/userspace-api/fwctl/index.rst @@ -0,0 +1,14 @@ +.. SPDX-License-Identifier: GPL-2.0 + +Firmware Control (FWCTL) Userspace API +====================================== + +A framework that define a common set of limited rules that allows user space +to securely construct and execute RPCs inside device firmware. + +.. toctree:: + :maxdepth: 1 + + fwctl + fwctl-cxl + pds_fwctl diff --git a/Documentation/userspace-api/fwctl/pds_fwctl.rst b/Documentation/userspace-api/fwctl/pds_fwctl.rst new file mode 100644 index 000000000000..b5a31f82c883 --- /dev/null +++ b/Documentation/userspace-api/fwctl/pds_fwctl.rst @@ -0,0 +1,46 @@ +.. SPDX-License-Identifier: GPL-2.0 + +================ +fwctl pds driver +================ + +:Author: Shannon Nelson + +Overview +======== + +The PDS Core device makes a fwctl service available through an +auxiliary_device named pds_core.fwctl.N. The pds_fwctl driver binds to +this device and registers itself with the fwctl subsystem. The resulting +userspace interface is used by an application that is a part of the +AMD Pensando software package for the Distributed Service Card (DSC). + +The pds_fwctl driver has little knowledge of the firmware's internals. +It only knows how to send commands through pds_core's message queue to the +firmware for fwctl requests. The set of fwctl operations available +depends on the firmware in the DSC, and the userspace application +version must match the firmware so that they can talk to each other. + +When a connection is created the pds_fwctl driver requests from the +firmware a list of firmware object endpoints, and for each endpoint the +driver requests a list of operations for that endpoint. + +Each operation description includes a firmware defined command attribute +that maps to the FWCTL scope levels. The driver translates those firmware +values into the FWCTL scope values which can then be used for filtering the +scoped user requests. + +pds_fwctl User API +================== + +Each RPC request includes the target endpoint and the operation id, and in +and out buffer lengths and pointers. The driver verifies the existence +of the requested endpoint and operations, then checks the request scope +against the required scope of the operation. The request is then put +together with the request data and sent through pds_core's message queue +to the firmware, and the results are returned to the caller. + +The RPC endpoints, operations, and buffer contents are defined by the +particular firmware package in the device, which varies across the +available product configurations. The details are available in the +specific product SDK documentation. diff --git a/Documentation/userspace-api/index.rst b/Documentation/userspace-api/index.rst index 274cc7546efc..b8c73be4fb11 100644 --- a/Documentation/userspace-api/index.rst +++ b/Documentation/userspace-api/index.rst @@ -35,6 +35,7 @@ Security-related interfaces mfd_noexec spec_ctrl tee + check_exec Devices and I/O =============== @@ -43,7 +44,9 @@ Devices and I/O :maxdepth: 1 accelerators/ocxl + dma-buf-heaps dma-buf-alloc-exchange + fwctl/index gpio/index iommufd media/index @@ -63,6 +66,7 @@ Everything else vduse futex2 perf_ring_buffer + ntsync .. only:: subproject and html diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documentation/userspace-api/ioctl/ioctl-number.rst index e4be1378ba26..406a9f4d0869 100644 --- a/Documentation/userspace-api/ioctl/ioctl-number.rst +++ b/Documentation/userspace-api/ioctl/ioctl-number.rst @@ -10,12 +10,14 @@ Michael Elizabeth Chastain If you are adding new ioctl's to the kernel, you should use the _IO macros defined in <linux/ioctl.h>: - ====== == ============================================ - _IO an ioctl with no parameters - _IOW an ioctl with write parameters (copy_from_user) - _IOR an ioctl with read parameters (copy_to_user) - _IOWR an ioctl with both write and read parameters. - ====== == ============================================ + ====== =========================== + macro parameters + ====== =========================== + _IO none + _IOW write (read from userspace) + _IOR read (write to userpace) + _IOWR write and read + ====== =========================== 'Write' and 'read' are from the user's point of view, just like the system calls 'write' and 'read'. For example, a SET_FOO ioctl would @@ -23,22 +25,24 @@ be _IOW, although the kernel would actually read data from user space; a GET_FOO ioctl would be _IOR, although the kernel would actually write data to user space. -The first argument to _IO, _IOW, _IOR, or _IOWR is an identifying letter -or number from the table below. Because of the large number of drivers, -many drivers share a partial letter with other drivers. +The first argument to the macros is an identifying letter or number from +the table below. Because of the large number of drivers, many drivers +share a partial letter with other drivers. If you are writing a driver for a new device and need a letter, pick an -unused block with enough room for expansion: 32 to 256 ioctl commands. -You can register the block by patching this file and submitting the -patch to Linus Torvalds. Or you can e-mail me at <mec@shout.net> and -I'll register one for you. +unused block with enough room for expansion: 32 to 256 ioctl commands +should suffice. You can register the block by patching this file and +submitting the patch through :doc:`usual patch submission process +</process/submitting-patches>`. -The second argument to _IO, _IOW, _IOR, or _IOWR is a sequence number -to distinguish ioctls from each other. The third argument to _IOW, -_IOR, or _IOWR is the type of the data going into the kernel or coming -out of the kernel (e.g. 'int' or 'struct foo'). NOTE! Do NOT use -sizeof(arg) as the third argument as this results in your ioctl thinking -it passes an argument of type size_t. +The second argument is a sequence number to distinguish ioctls from each +other. The third argument (not applicable to _IO) is the type of the data +going into the kernel or coming out of the kernel (e.g. 'int' or +'struct foo'). + +.. note:: + Do NOT use sizeof(arg) as the third argument as this results in your + ioctl thinking it passes an argument of type size_t. Some devices use their major number as the identifier; this is OK, as long as it is unique. Some devices are irregular and don't follow any @@ -51,7 +55,7 @@ Following this convention is good because: error rather than some unexpected behaviour. (2) The 'strace' build procedure automatically finds ioctl numbers - defined with _IO, _IOW, _IOR, or _IOWR. + defined with the macros. (3) 'strace' can decode numbers back into useful names when the numbers are unique. @@ -62,334 +66,347 @@ Following this convention is good because: (5) When following the convention, the driver code can use generic code to copy the parameters between user and kernel space. -This table lists ioctls visible from user land for Linux/x86. It contains -most drivers up to 2.6.31, but I know I am missing some. There has been -no attempt to list non-X86 architectures or ioctls from drivers/staging/. +This table lists ioctls visible from userland, excluding ones from +drivers/staging/. -==== ===== ======================================================= ================================================================ -Code Seq# Include File Comments +==== ===== ========================================================= ================================================================ +Code Seq# Include File Comments (hex) -==== ===== ======================================================= ================================================================ -0x00 00-1F linux/fs.h conflict! -0x00 00-1F scsi/scsi_ioctl.h conflict! -0x00 00-1F linux/fb.h conflict! -0x00 00-1F linux/wavefront.h conflict! +==== ===== ========================================================= ================================================================ +0x00 00-1F linux/fs.h conflict! +0x00 00-1F scsi/scsi_ioctl.h conflict! +0x00 00-1F linux/fb.h conflict! +0x00 00-1F linux/wavefront.h conflict! 0x02 all linux/fd.h 0x03 all linux/hdreg.h -0x04 D2-DC linux/umsdos_fs.h Dead since 2.6.11, but don't reuse these. +0x04 D2-DC linux/umsdos_fs.h Dead since 2.6.11, but don't reuse these. 0x06 all linux/lp.h 0x07 9F-D0 linux/vmw_vmci_defs.h, uapi/linux/vm_sockets.h 0x09 all linux/raid/md_u.h 0x10 00-0F drivers/char/s390/vmcp.h 0x10 10-1F arch/s390/include/uapi/sclp_ctl.h 0x10 20-2F arch/s390/include/uapi/asm/hypfs.h -0x12 all linux/fs.h BLK* ioctls +0x12 all linux/fs.h BLK* ioctls linux/blkpg.h -0x15 all linux/fs.h FS_IOC_* ioctls -0x1b all InfiniBand Subsystem - <http://infiniband.sourceforge.net/> + linux/blkzoned.h + linux/blk-crypto.h +0x15 all linux/fs.h FS_IOC_* ioctls +0x1b all InfiniBand Subsystem + <http://infiniband.sourceforge.net/> 0x20 all drivers/cdrom/cm206.h 0x22 all scsi/sg.h -0x3E 00-0F linux/counter.h <mailto:linux-iio@vger.kernel.org> +0x3E 00-0F linux/counter.h <mailto:linux-iio@vger.kernel.org> '!' 00-1F uapi/linux/seccomp.h -'#' 00-3F IEEE 1394 Subsystem - Block for the entire subsystem +'#' 00-3F IEEE 1394 Subsystem + Block for the entire subsystem '$' 00-0F linux/perf_counter.h, linux/perf_event.h -'%' 00-0F include/uapi/linux/stm.h System Trace Module subsystem - <mailto:alexander.shishkin@linux.intel.com> +'%' 00-0F include/uapi/linux/stm.h System Trace Module subsystem + <mailto:alexander.shishkin@linux.intel.com> '&' 00-07 drivers/firewire/nosy-user.h -'*' 00-1F uapi/linux/user_events.h User Events Subsystem - <mailto:linux-trace-kernel@vger.kernel.org> -'1' 00-1F linux/timepps.h PPS kit from Ulrich Windl - <ftp://ftp.de.kernel.org/pub/linux/daemons/ntp/PPS/> +'*' 00-1F uapi/linux/user_events.h User Events Subsystem + <mailto:linux-trace-kernel@vger.kernel.org> +'1' 00-1F linux/timepps.h PPS kit from Ulrich Windl + <ftp://ftp.de.kernel.org/pub/linux/daemons/ntp/PPS/> '2' 01-04 linux/i2o.h -'3' 00-0F drivers/s390/char/raw3270.h conflict! -'3' 00-1F linux/suspend_ioctls.h, conflict! +'3' 00-0F drivers/s390/char/raw3270.h conflict! +'3' 00-1F linux/suspend_ioctls.h, conflict! kernel/power/user.c -'8' all SNP8023 advanced NIC card - <mailto:mcr@solidum.com> +'8' all SNP8023 advanced NIC card + <mailto:mcr@solidum.com> ';' 64-7F linux/vfio.h ';' 80-FF linux/iommufd.h -'=' 00-3f uapi/linux/ptp_clock.h <mailto:richardcochran@gmail.com> -'@' 00-0F linux/radeonfb.h conflict! -'@' 00-0F drivers/video/aty/aty128fb.c conflict! -'A' 00-1F linux/apm_bios.h conflict! -'A' 00-0F linux/agpgart.h, conflict! +'=' 00-3f uapi/linux/ptp_clock.h <mailto:richardcochran@gmail.com> +'@' 00-0F linux/radeonfb.h conflict! +'@' 00-0F drivers/video/aty/aty128fb.c conflict! +'A' 00-1F linux/apm_bios.h conflict! +'A' 00-0F linux/agpgart.h, conflict! drivers/char/agp/compat_ioctl.h -'A' 00-7F sound/asound.h conflict! -'B' 00-1F linux/cciss_ioctl.h conflict! -'B' 00-0F include/linux/pmu.h conflict! -'B' C0-FF advanced bbus <mailto:maassen@uni-freiburg.de> -'B' 00-0F xen/xenbus_dev.h conflict! -'C' all linux/soundcard.h conflict! -'C' 01-2F linux/capi.h conflict! -'C' F0-FF drivers/net/wan/cosa.h conflict! +'A' 00-7F sound/asound.h conflict! +'B' 00-1F linux/cciss_ioctl.h conflict! +'B' 00-0F include/linux/pmu.h conflict! +'B' C0-FF advanced bbus <mailto:maassen@uni-freiburg.de> +'B' 00-0F xen/xenbus_dev.h conflict! +'C' all linux/soundcard.h conflict! +'C' 01-2F linux/capi.h conflict! +'C' F0-FF drivers/net/wan/cosa.h conflict! 'D' all arch/s390/include/asm/dasd.h -'D' 40-5F drivers/scsi/dpt/dtpi_ioctl.h Dead since 2022 +'D' 40-5F drivers/scsi/dpt/dtpi_ioctl.h Dead since 2022 'D' 05 drivers/scsi/pmcraid.h -'E' all linux/input.h conflict! -'E' 00-0F xen/evtchn.h conflict! -'F' all linux/fb.h conflict! -'F' 01-02 drivers/scsi/pmcraid.h conflict! -'F' 20 drivers/video/fsl-diu-fb.h conflict! -'F' 20 linux/ivtvfb.h conflict! -'F' 20 linux/matroxfb.h conflict! -'F' 20 drivers/video/aty/atyfb_base.c conflict! -'F' 00-0F video/da8xx-fb.h conflict! -'F' 80-8F linux/arcfb.h conflict! -'F' DD video/sstfb.h conflict! -'G' 00-3F drivers/misc/sgi-gru/grulib.h conflict! -'G' 00-0F xen/gntalloc.h, xen/gntdev.h conflict! -'H' 00-7F linux/hiddev.h conflict! -'H' 00-0F linux/hidraw.h conflict! -'H' 01 linux/mei.h conflict! -'H' 02 linux/mei.h conflict! -'H' 03 linux/mei.h conflict! -'H' 00-0F sound/asound.h conflict! -'H' 20-40 sound/asound_fm.h conflict! -'H' 80-8F sound/sfnt_info.h conflict! -'H' 10-8F sound/emu10k1.h conflict! -'H' 10-1F sound/sb16_csp.h conflict! -'H' 10-1F sound/hda_hwdep.h conflict! -'H' 40-4F sound/hdspm.h conflict! -'H' 40-4F sound/hdsp.h conflict! +'E' all linux/input.h conflict! +'E' 00-0F xen/evtchn.h conflict! +'F' all linux/fb.h conflict! +'F' 01-02 drivers/scsi/pmcraid.h conflict! +'F' 20 drivers/video/fsl-diu-fb.h conflict! +'F' 20 linux/ivtvfb.h conflict! +'F' 20 linux/matroxfb.h conflict! +'F' 20 drivers/video/aty/atyfb_base.c conflict! +'F' 00-0F video/da8xx-fb.h conflict! +'F' 80-8F linux/arcfb.h conflict! +'F' DD video/sstfb.h conflict! +'G' 00-3F drivers/misc/sgi-gru/grulib.h conflict! +'G' 00-0F xen/gntalloc.h, xen/gntdev.h conflict! +'H' 00-7F linux/hiddev.h conflict! +'H' 00-0F linux/hidraw.h conflict! +'H' 01 linux/mei.h conflict! +'H' 02 linux/mei.h conflict! +'H' 03 linux/mei.h conflict! +'H' 00-0F sound/asound.h conflict! +'H' 20-40 sound/asound_fm.h conflict! +'H' 80-8F sound/sfnt_info.h conflict! +'H' 10-8F sound/emu10k1.h conflict! +'H' 10-1F sound/sb16_csp.h conflict! +'H' 10-1F sound/hda_hwdep.h conflict! +'H' 40-4F sound/hdspm.h conflict! +'H' 40-4F sound/hdsp.h conflict! 'H' 90 sound/usb/usx2y/usb_stream.h -'H' 00-0F uapi/misc/habanalabs.h conflict! +'H' 00-0F uapi/misc/habanalabs.h conflict! 'H' A0 uapi/linux/usb/cdc-wdm.h -'H' C0-F0 net/bluetooth/hci.h conflict! -'H' C0-DF net/bluetooth/hidp/hidp.h conflict! -'H' C0-DF net/bluetooth/cmtp/cmtp.h conflict! -'H' C0-DF net/bluetooth/bnep/bnep.h conflict! -'H' F1 linux/hid-roccat.h <mailto:erazor_de@users.sourceforge.net> +'H' C0-F0 net/bluetooth/hci.h conflict! +'H' C0-DF net/bluetooth/hidp/hidp.h conflict! +'H' C0-DF net/bluetooth/cmtp/cmtp.h conflict! +'H' C0-DF net/bluetooth/bnep/bnep.h conflict! +'H' F1 linux/hid-roccat.h <mailto:erazor_de@users.sourceforge.net> 'H' F8-FA sound/firewire.h -'I' all linux/isdn.h conflict! -'I' 00-0F drivers/isdn/divert/isdn_divert.h conflict! -'I' 40-4F linux/mISDNif.h conflict! +'I' all linux/isdn.h conflict! +'I' 00-0F drivers/isdn/divert/isdn_divert.h conflict! +'I' 40-4F linux/mISDNif.h conflict! 'K' all linux/kd.h -'L' 00-1F linux/loop.h conflict! -'L' 10-1F drivers/scsi/mpt3sas/mpt3sas_ctl.h conflict! -'L' E0-FF linux/ppdd.h encrypted disk device driver - <http://linux01.gwdg.de/~alatham/ppdd.html> -'M' all linux/soundcard.h conflict! -'M' 01-16 mtd/mtd-abi.h conflict! +'L' 00-1F linux/loop.h conflict! +'L' 10-1F drivers/scsi/mpt3sas/mpt3sas_ctl.h conflict! +'L' E0-FF linux/ppdd.h encrypted disk device driver + <http://linux01.gwdg.de/~alatham/ppdd.html> +'M' all linux/soundcard.h conflict! +'M' 01-16 mtd/mtd-abi.h conflict! and drivers/mtd/mtdchar.c 'M' 01-03 drivers/scsi/megaraid/megaraid_sas.h -'M' 00-0F drivers/video/fsl-diu-fb.h conflict! +'M' 00-0F drivers/video/fsl-diu-fb.h conflict! 'N' 00-1F drivers/usb/scanner.h 'N' 40-7F drivers/block/nvme.c -'N' 80-8F uapi/linux/ntsync.h NT synchronization primitives - <mailto:wine-devel@winehq.org> -'O' 00-06 mtd/ubi-user.h UBI -'P' all linux/soundcard.h conflict! -'P' 60-6F sound/sscape_ioctl.h conflict! -'P' 00-0F drivers/usb/class/usblp.c conflict! -'P' 01-09 drivers/misc/pci_endpoint_test.c conflict! -'P' 00-0F xen/privcmd.h conflict! -'P' 00-05 linux/tps6594_pfsm.h conflict! +'N' 80-8F uapi/linux/ntsync.h NT synchronization primitives + <mailto:wine-devel@winehq.org> +'O' 00-06 mtd/ubi-user.h UBI +'P' all linux/soundcard.h conflict! +'P' 60-6F sound/sscape_ioctl.h conflict! +'P' 00-0F drivers/usb/class/usblp.c conflict! +'P' 01-09 drivers/misc/pci_endpoint_test.c conflict! +'P' 00-0F xen/privcmd.h conflict! +'P' 00-05 linux/tps6594_pfsm.h conflict! 'Q' all linux/soundcard.h -'R' 00-1F linux/random.h conflict! -'R' 01 linux/rfkill.h conflict! +'R' 00-1F linux/random.h conflict! +'R' 01 linux/rfkill.h conflict! 'R' 20-2F linux/trace_mmap.h 'R' C0-DF net/bluetooth/rfcomm.h 'R' E0 uapi/linux/fsl_mc.h -'S' all linux/cdrom.h conflict! -'S' 80-81 scsi/scsi_ioctl.h conflict! -'S' 82-FF scsi/scsi.h conflict! -'S' 00-7F sound/asequencer.h conflict! -'T' all linux/soundcard.h conflict! -'T' 00-AF sound/asound.h conflict! -'T' all arch/x86/include/asm/ioctls.h conflict! -'T' C0-DF linux/if_tun.h conflict! -'U' all sound/asound.h conflict! -'U' 00-CF linux/uinput.h conflict! +'S' all linux/cdrom.h conflict! +'S' 80-81 scsi/scsi_ioctl.h conflict! +'S' 82-FF scsi/scsi.h conflict! +'S' 00-7F sound/asequencer.h conflict! +'T' all linux/soundcard.h conflict! +'T' 00-AF sound/asound.h conflict! +'T' all arch/x86/include/asm/ioctls.h conflict! +'T' C0-DF linux/if_tun.h conflict! +'U' all sound/asound.h conflict! +'U' 00-CF linux/uinput.h conflict! 'U' 00-EF linux/usbdevice_fs.h 'U' C0-CF drivers/bluetooth/hci_uart.h -'V' all linux/vt.h conflict! -'V' all linux/videodev2.h conflict! -'V' C0 linux/ivtvfb.h conflict! -'V' C0 linux/ivtv.h conflict! -'V' C0 media/si4713.h conflict! -'W' 00-1F linux/watchdog.h conflict! -'W' 00-1F linux/wanrouter.h conflict! (pre 3.9) -'W' 00-3F sound/asound.h conflict! +'V' all linux/vt.h conflict! +'V' all linux/videodev2.h conflict! +'V' C0 linux/ivtvfb.h conflict! +'V' C0 linux/ivtv.h conflict! +'V' C0 media/si4713.h conflict! +'W' 00-1F linux/watchdog.h conflict! +'W' 00-1F linux/wanrouter.h conflict! (pre 3.9) +'W' 00-3F sound/asound.h conflict! 'W' 40-5F drivers/pci/switch/switchtec.c 'W' 60-61 linux/watch_queue.h -'X' all fs/xfs/xfs_fs.h, conflict! +'X' all fs/xfs/xfs_fs.h, conflict! fs/xfs/linux-2.6/xfs_ioctl32.h, include/linux/falloc.h, linux/fs.h, -'X' all fs/ocfs2/ocfs_fs.h conflict! -'X' 01 linux/pktcdvd.h conflict! +'X' all fs/ocfs2/ocfs_fs.h conflict! 'Z' 14-15 drivers/message/fusion/mptctl.h -'[' 00-3F linux/usb/tmc.h USB Test and Measurement Devices - <mailto:gregkh@linuxfoundation.org> -'a' all linux/atm*.h, linux/sonet.h ATM on linux - <http://lrcwww.epfl.ch/> -'a' 00-0F drivers/crypto/qat/qat_common/adf_cfg_common.h conflict! qat driver -'b' 00-FF conflict! bit3 vme host bridge - <mailto:natalia@nikhefk.nikhef.nl> -'b' 00-0F linux/dma-buf.h conflict! -'c' 00-7F linux/comstats.h conflict! -'c' 00-7F linux/coda.h conflict! -'c' 00-1F linux/chio.h conflict! -'c' 80-9F arch/s390/include/asm/chsc.h conflict! +'[' 00-3F linux/usb/tmc.h USB Test and Measurement Devices + <mailto:gregkh@linuxfoundation.org> +'a' all linux/atm*.h, linux/sonet.h ATM on linux + <http://lrcwww.epfl.ch/> +'a' 00-0F drivers/crypto/qat/qat_common/adf_cfg_common.h conflict! qat driver +'b' 00-FF conflict! bit3 vme host bridge + <mailto:natalia@nikhefk.nikhef.nl> +'b' 00-0F linux/dma-buf.h conflict! +'c' 00-7F linux/comstats.h conflict! +'c' 00-7F linux/coda.h conflict! +'c' 00-1F linux/chio.h conflict! +'c' 80-9F arch/s390/include/asm/chsc.h conflict! 'c' A0-AF arch/x86/include/asm/msr.h conflict! -'d' 00-FF linux/char/drm/drm.h conflict! -'d' 02-40 pcmcia/ds.h conflict! +'d' 00-FF linux/char/drm/drm.h conflict! +'d' 02-40 pcmcia/ds.h conflict! 'd' F0-FF linux/digi1.h -'e' all linux/digi1.h conflict! -'f' 00-1F linux/ext2_fs.h conflict! -'f' 00-1F linux/ext3_fs.h conflict! -'f' 00-0F fs/jfs/jfs_dinode.h conflict! -'f' 00-0F fs/ext4/ext4.h conflict! -'f' 00-0F linux/fs.h conflict! -'f' 00-0F fs/ocfs2/ocfs2_fs.h conflict! +'e' all linux/digi1.h conflict! +'f' 00-1F linux/ext2_fs.h conflict! +'f' 00-1F linux/ext3_fs.h conflict! +'f' 00-0F fs/jfs/jfs_dinode.h conflict! +'f' 00-0F fs/ext4/ext4.h conflict! +'f' 00-0F linux/fs.h conflict! +'f' 00-0F fs/ocfs2/ocfs2_fs.h conflict! 'f' 13-27 linux/fscrypt.h 'f' 81-8F linux/fsverity.h 'g' 00-0F linux/usb/gadgetfs.h 'g' 20-2F linux/usb/g_printer.h -'h' 00-7F conflict! Charon filesystem - <mailto:zapman@interlan.net> -'h' 00-1F linux/hpet.h conflict! +'h' 00-7F conflict! Charon filesystem + <mailto:zapman@interlan.net> +'h' 00-1F linux/hpet.h conflict! 'h' 80-8F fs/hfsplus/ioctl.c -'i' 00-3F linux/i2o-dev.h conflict! -'i' 0B-1F linux/ipmi.h conflict! +'i' 00-3F linux/i2o-dev.h conflict! +'i' 0B-1F linux/ipmi.h conflict! 'i' 80-8F linux/i8k.h -'i' 90-9F `linux/iio/*.h` IIO +'i' 90-9F `linux/iio/*.h` IIO 'j' 00-3F linux/joystick.h -'k' 00-0F linux/spi/spidev.h conflict! -'k' 00-05 video/kyro.h conflict! -'k' 10-17 linux/hsi/hsi_char.h HSI character device -'l' 00-3F linux/tcfs_fs.h transparent cryptographic file system - <http://web.archive.org/web/%2A/http://mikonos.dia.unisa.it/tcfs> -'l' 40-7F linux/udf_fs_i.h in development: - <https://github.com/pali/udftools> -'m' 00-09 linux/mmtimer.h conflict! -'m' all linux/mtio.h conflict! -'m' all linux/soundcard.h conflict! -'m' all linux/synclink.h conflict! -'m' 00-19 drivers/message/fusion/mptctl.h conflict! -'m' 00 drivers/scsi/megaraid/megaraid_ioctl.h conflict! +'k' 00-0F linux/spi/spidev.h conflict! +'k' 00-05 video/kyro.h conflict! +'k' 10-17 linux/hsi/hsi_char.h HSI character device +'l' 00-3F linux/tcfs_fs.h transparent cryptographic file system + <http://web.archive.org/web/%2A/http://mikonos.dia.unisa.it/tcfs> +'l' 40-7F linux/udf_fs_i.h in development: + <https://github.com/pali/udftools> +'m' 00-09 linux/mmtimer.h conflict! +'m' all linux/mtio.h conflict! +'m' all linux/soundcard.h conflict! +'m' all linux/synclink.h conflict! +'m' 00-19 drivers/message/fusion/mptctl.h conflict! +'m' 00 drivers/scsi/megaraid/megaraid_ioctl.h conflict! 'n' 00-7F linux/ncp_fs.h and fs/ncpfs/ioctl.c -'n' 80-8F uapi/linux/nilfs2_api.h NILFS2 -'n' E0-FF linux/matroxfb.h matroxfb -'o' 00-1F fs/ocfs2/ocfs2_fs.h OCFS2 -'o' 00-03 mtd/ubi-user.h conflict! (OCFS2 and UBI overlaps) -'o' 40-41 mtd/ubi-user.h UBI -'o' 01-A1 `linux/dvb/*.h` DVB -'p' 00-0F linux/phantom.h conflict! (OpenHaptics needs this) -'p' 00-1F linux/rtc.h conflict! +'n' 80-8F uapi/linux/nilfs2_api.h NILFS2 +'n' E0-FF linux/matroxfb.h matroxfb +'o' 00-1F fs/ocfs2/ocfs2_fs.h OCFS2 +'o' 00-03 mtd/ubi-user.h conflict! (OCFS2 and UBI overlaps) +'o' 40-41 mtd/ubi-user.h UBI +'o' 01-A1 `linux/dvb/*.h` DVB +'p' 00-0F linux/phantom.h conflict! (OpenHaptics needs this) +'p' 00-1F linux/rtc.h conflict! 'p' 40-7F linux/nvram.h -'p' 80-9F linux/ppdev.h user-space parport - <mailto:tim@cyberelk.net> -'p' A1-A5 linux/pps.h LinuxPPS - <mailto:giometti@linux.it> +'p' 80-9F linux/ppdev.h user-space parport + <mailto:tim@cyberelk.net> +'p' A1-A5 linux/pps.h LinuxPPS +'p' B1-B3 linux/pps_gen.h LinuxPPS + <mailto:giometti@linux.it> 'q' 00-1F linux/serio.h -'q' 80-FF linux/telephony.h Internet PhoneJACK, Internet LineJACK - linux/ixjuser.h <http://web.archive.org/web/%2A/http://www.quicknet.net> +'q' 80-FF linux/telephony.h Internet PhoneJACK, Internet LineJACK + linux/ixjuser.h <http://web.archive.org/web/%2A/http://www.quicknet.net> 'r' 00-1F linux/msdos_fs.h and fs/fat/dir.c 's' all linux/cdk.h 't' 00-7F linux/ppp-ioctl.h 't' 80-8F linux/isdn_ppp.h -'t' 90-91 linux/toshiba.h toshiba and toshiba_acpi SMM -'u' 00-1F linux/smb_fs.h gone -'u' 00-2F linux/ublk_cmd.h conflict! -'u' 20-3F linux/uvcvideo.h USB video class host driver -'u' 40-4f linux/udmabuf.h userspace dma-buf misc device -'v' 00-1F linux/ext2_fs.h conflict! -'v' 00-1F linux/fs.h conflict! -'v' 00-0F linux/sonypi.h conflict! -'v' 00-0F media/v4l2-subdev.h conflict! -'v' 20-27 arch/powerpc/include/uapi/asm/vas-api.h VAS API -'v' C0-FF linux/meye.h conflict! -'w' all CERN SCI driver -'y' 00-1F packet based user level communications - <mailto:zapman@interlan.net> -'z' 00-3F CAN bus card conflict! - <mailto:hdstich@connectu.ulm.circular.de> -'z' 40-7F CAN bus card conflict! - <mailto:oe@port.de> -'z' 10-4F drivers/s390/crypto/zcrypt_api.h conflict! +'t' 90-91 linux/toshiba.h toshiba and toshiba_acpi SMM +'u' 00-1F linux/smb_fs.h gone +'u' 00-2F linux/ublk_cmd.h conflict! +'u' 20-3F linux/uvcvideo.h USB video class host driver +'u' 40-4f linux/udmabuf.h userspace dma-buf misc device +'v' 00-1F linux/ext2_fs.h conflict! +'v' 00-1F linux/fs.h conflict! +'v' 00-0F linux/sonypi.h conflict! +'v' 00-0F media/v4l2-subdev.h conflict! +'v' 20-27 arch/powerpc/include/uapi/asm/vas-api.h VAS API +'v' C0-FF linux/meye.h conflict! +'w' all CERN SCI driver +'y' 00-1F packet based user level communications + <mailto:zapman@interlan.net> +'z' 00-3F CAN bus card conflict! + <mailto:hdstich@connectu.ulm.circular.de> +'z' 40-7F CAN bus card conflict! + <mailto:oe@port.de> +'z' 10-4F drivers/s390/crypto/zcrypt_api.h conflict! '|' 00-7F linux/media.h +'|' 80-9F samples/ Any sample and example drivers 0x80 00-1F linux/fb.h 0x81 00-1F linux/vduse.h 0x89 00-06 arch/x86/include/asm/sockios.h 0x89 0B-DF linux/sockios.h -0x89 E0-EF linux/sockios.h SIOCPROTOPRIVATE range -0x89 F0-FF linux/sockios.h SIOCDEVPRIVATE range +0x89 E0-EF linux/sockios.h SIOCPROTOPRIVATE range +0x89 F0-FF linux/sockios.h SIOCDEVPRIVATE range 0x8A 00-1F linux/eventpoll.h 0x8B all linux/wireless.h -0x8C 00-3F WiNRADiO driver - <http://www.winradio.com.au/> +0x8C 00-3F WiNRADiO driver + <http://www.winradio.com.au/> 0x90 00 drivers/cdrom/sbpcd.h 0x92 00-0F drivers/usb/mon/mon_bin.c 0x93 60-7F linux/auto_fs.h -0x94 all fs/btrfs/ioctl.h Btrfs filesystem - and linux/fs.h some lifted to vfs/generic -0x97 00-7F fs/ceph/ioctl.h Ceph file system -0x99 00-0F 537-Addinboard driver - <mailto:buk@buks.ipn.de> -0xA0 all linux/sdp/sdp.h Industrial Device Project - <mailto:kenji@bitgate.com> -0xA1 0 linux/vtpm_proxy.h TPM Emulator Proxy Driver -0xA2 all uapi/linux/acrn.h ACRN hypervisor -0xA3 80-8F Port ACL in development: - <mailto:tlewis@mindspring.com> +0x94 all fs/btrfs/ioctl.h Btrfs filesystem + and linux/fs.h some lifted to vfs/generic +0x97 00-7F fs/ceph/ioctl.h Ceph file system +0x99 00-0F 537-Addinboard driver + <mailto:buk@buks.ipn.de> +0x9A 00-0F include/uapi/fwctl/fwctl.h +0xA0 all linux/sdp/sdp.h Industrial Device Project + <mailto:kenji@bitgate.com> +0xA1 0 linux/vtpm_proxy.h TPM Emulator Proxy Driver +0xA2 all uapi/linux/acrn.h ACRN hypervisor +0xA3 80-8F Port ACL in development: + <mailto:tlewis@mindspring.com> 0xA3 90-9F linux/dtlk.h -0xA4 00-1F uapi/linux/tee.h Generic TEE subsystem -0xA4 00-1F uapi/asm/sgx.h <mailto:linux-sgx@vger.kernel.org> -0xA5 01-05 linux/surface_aggregator/cdev.h Microsoft Surface Platform System Aggregator - <mailto:luzmaximilian@gmail.com> -0xA5 20-2F linux/surface_aggregator/dtx.h Microsoft Surface DTX driver - <mailto:luzmaximilian@gmail.com> +0xA4 00-1F uapi/linux/tee.h Generic TEE subsystem +0xA4 00-1F uapi/asm/sgx.h <mailto:linux-sgx@vger.kernel.org> +0xA5 01-05 linux/surface_aggregator/cdev.h Microsoft Surface Platform System Aggregator + <mailto:luzmaximilian@gmail.com> +0xA5 20-2F linux/surface_aggregator/dtx.h Microsoft Surface DTX driver + <mailto:luzmaximilian@gmail.com> 0xAA 00-3F linux/uapi/linux/userfaultfd.h 0xAB 00-1F linux/nbd.h 0xAC 00-1F linux/raw.h -0xAD 00 Netfilter device in development: - <mailto:rusty@rustcorp.com.au> -0xAE 00-1F linux/kvm.h Kernel-based Virtual Machine - <mailto:kvm@vger.kernel.org> -0xAE 40-FF linux/kvm.h Kernel-based Virtual Machine - <mailto:kvm@vger.kernel.org> -0xAE 20-3F linux/nitro_enclaves.h Nitro Enclaves -0xAF 00-1F linux/fsl_hypervisor.h Freescale hypervisor -0xB0 all RATIO devices in development: - <mailto:vgo@ratio.de> -0xB1 00-1F PPPoX - <mailto:mostrows@styx.uwaterloo.ca> -0xB2 00 arch/powerpc/include/uapi/asm/papr-vpd.h powerpc/pseries VPD API - <mailto:linuxppc-dev> -0xB2 01-02 arch/powerpc/include/uapi/asm/papr-sysparm.h powerpc/pseries system parameter API - <mailto:linuxppc-dev> +0xAD 00 Netfilter device in development: + <mailto:rusty@rustcorp.com.au> +0xAE 00-1F linux/kvm.h Kernel-based Virtual Machine + <mailto:kvm@vger.kernel.org> +0xAE 40-FF linux/kvm.h Kernel-based Virtual Machine + <mailto:kvm@vger.kernel.org> +0xAE 20-3F linux/nitro_enclaves.h Nitro Enclaves +0xAF 00-1F linux/fsl_hypervisor.h Freescale hypervisor +0xB0 all RATIO devices in development: + <mailto:vgo@ratio.de> +0xB1 00-1F PPPoX + <mailto:mostrows@styx.uwaterloo.ca> +0xB2 00 arch/powerpc/include/uapi/asm/papr-vpd.h powerpc/pseries VPD API + <mailto:linuxppc-dev@lists.ozlabs.org> +0xB2 01-02 arch/powerpc/include/uapi/asm/papr-sysparm.h powerpc/pseries system parameter API + <mailto:linuxppc-dev@lists.ozlabs.org> +0xB2 03-05 arch/powerpc/include/uapi/asm/papr-indices.h powerpc/pseries indices API + <mailto:linuxppc-dev@lists.ozlabs.org> +0xB2 06-07 arch/powerpc/include/uapi/asm/papr-platform-dump.h powerpc/pseries Platform Dump API + <mailto:linuxppc-dev@lists.ozlabs.org> +0xB2 08 arch/powerpc/include/uapi/asm/papr-physical-attestation.h powerpc/pseries Physical Attestation API + <mailto:linuxppc-dev@lists.ozlabs.org> 0xB3 00 linux/mmc/ioctl.h -0xB4 00-0F linux/gpio.h <mailto:linux-gpio@vger.kernel.org> -0xB5 00-0F uapi/linux/rpmsg.h <mailto:linux-remoteproc@vger.kernel.org> +0xB4 00-0F linux/gpio.h <mailto:linux-gpio@vger.kernel.org> +0xB5 00-0F uapi/linux/rpmsg.h <mailto:linux-remoteproc@vger.kernel.org> 0xB6 all linux/fpga-dfl.h -0xB7 all uapi/linux/remoteproc_cdev.h <mailto:linux-remoteproc@vger.kernel.org> -0xB7 all uapi/linux/nsfs.h <mailto:Andrei Vagin <avagin@openvz.org>> -0xB8 01-02 uapi/misc/mrvl_cn10k_dpi.h Marvell CN10K DPI driver +0xB7 all uapi/linux/remoteproc_cdev.h <mailto:linux-remoteproc@vger.kernel.org> +0xB7 all uapi/linux/nsfs.h <mailto:Andrei Vagin <avagin@openvz.org>> +0xB8 01-02 uapi/misc/mrvl_cn10k_dpi.h Marvell CN10K DPI driver +0xB8 all uapi/linux/mshv.h Microsoft Hyper-V /dev/mshv driver + <mailto:linux-hyperv@vger.kernel.org> 0xC0 00-0F linux/usb/iowarrior.h -0xCA 00-0F uapi/misc/cxl.h +0xCA 00-0F uapi/misc/cxl.h Dead since 6.15 0xCA 10-2F uapi/misc/ocxl.h -0xCA 80-BF uapi/scsi/cxlflash_ioctl.h -0xCB 00-1F CBM serial IEC bus in development: - <mailto:michael.klein@puffin.lb.shuttle.de> -0xCC 00-0F drivers/misc/ibmvmc.h pseries VMC driver -0xCD 01 linux/reiserfs_fs.h -0xCE 01-02 uapi/linux/cxl_mem.h Compute Express Link Memory Devices +0xCA 80-BF uapi/scsi/cxlflash_ioctl.h Dead since 6.15 +0xCB 00-1F CBM serial IEC bus in development: + <mailto:michael.klein@puffin.lb.shuttle.de> +0xCC 00-0F drivers/misc/ibmvmc.h pseries VMC driver +0xCD 01 linux/reiserfs_fs.h Dead since 6.13 +0xCE 01-02 uapi/linux/cxl_mem.h Compute Express Link Memory Devices 0xCF 02 fs/smb/client/cifs_ioctl.h 0xDB 00-0F drivers/char/mwave/mwavepub.h -0xDD 00-3F ZFCP device driver see drivers/s390/scsi/ - <mailto:aherrman@de.ibm.com> +0xDD 00-3F ZFCP device driver see drivers/s390/scsi/ + <mailto:aherrman@de.ibm.com> 0xE5 00-3F linux/fuse.h -0xEC 00-01 drivers/platform/chrome/cros_ec_dev.h ChromeOS EC driver -0xEE 00-09 uapi/linux/pfrut.h Platform Firmware Runtime Update and Telemetry -0xF3 00-3F drivers/usb/misc/sisusbvga/sisusb.h sisfb (in development) - <mailto:thomas@winischhofer.net> -0xF6 all LTTng Linux Trace Toolkit Next Generation - <mailto:mathieu.desnoyers@efficios.com> -0xF8 all arch/x86/include/uapi/asm/amd_hsmp.h AMD HSMP EPYC system management interface driver - <mailto:nchatrad@amd.com> +0xEC 00-01 drivers/platform/chrome/cros_ec_dev.h ChromeOS EC driver +0xEE 00-09 uapi/linux/pfrut.h Platform Firmware Runtime Update and Telemetry +0xF3 00-3F drivers/usb/misc/sisusbvga/sisusb.h sisfb (in development) + <mailto:thomas@winischhofer.net> +0xF6 all LTTng Linux Trace Toolkit Next Generation + <mailto:mathieu.desnoyers@efficios.com> +0xF8 all arch/x86/include/uapi/asm/amd_hsmp.h AMD HSMP EPYC system management interface driver + <mailto:nchatrad@amd.com> +0xF9 00-0F uapi/misc/amd-apml.h AMD side band system management interface driver + <mailto:naveenkrishna.chatradhi@amd.com> 0xFD all linux/dm-ioctl.h 0xFE all linux/isst_if.h -==== ===== ======================================================= ================================================================ +==== ===== ========================================================= ================================================================ diff --git a/Documentation/userspace-api/iommufd.rst b/Documentation/userspace-api/iommufd.rst index aa004faed5fd..f1c4d21e5c5e 100644 --- a/Documentation/userspace-api/iommufd.rst +++ b/Documentation/userspace-api/iommufd.rst @@ -41,46 +41,159 @@ Following IOMMUFD objects are exposed to userspace: - IOMMUFD_OBJ_DEVICE, representing a device that is bound to iommufd by an external driver. -- IOMMUFD_OBJ_HW_PAGETABLE, representing an actual hardware I/O page table - (i.e. a single struct iommu_domain) managed by the iommu driver. - - The IOAS has a list of HW_PAGETABLES that share the same IOVA mapping and - it will synchronize its mapping with each member HW_PAGETABLE. +- IOMMUFD_OBJ_HWPT_PAGING, representing an actual hardware I/O page table + (i.e. a single struct iommu_domain) managed by the iommu driver. "PAGING" + primarily indicates this type of HWPT should be linked to an IOAS. It also + indicates that it is backed by an iommu_domain with __IOMMU_DOMAIN_PAGING + feature flag. This can be either an UNMANAGED stage-1 domain for a device + running in the user space, or a nesting parent stage-2 domain for mappings + from guest-level physical addresses to host-level physical addresses. + + The IOAS has a list of HWPT_PAGINGs that share the same IOVA mapping and + it will synchronize its mapping with each member HWPT_PAGING. + +- IOMMUFD_OBJ_HWPT_NESTED, representing an actual hardware I/O page table + (i.e. a single struct iommu_domain) managed by user space (e.g. guest OS). + "NESTED" indicates that this type of HWPT should be linked to an HWPT_PAGING. + It also indicates that it is backed by an iommu_domain that has a type of + IOMMU_DOMAIN_NESTED. This must be a stage-1 domain for a device running in + the user space (e.g. in a guest VM enabling the IOMMU nested translation + feature.) As such, it must be created with a given nesting parent stage-2 + domain to associate to. This nested stage-1 page table managed by the user + space usually has mappings from guest-level I/O virtual addresses to guest- + level physical addresses. + +- IOMMUFD_FAULT, representing a software queue for an HWPT reporting IO page + faults using the IOMMU HW's PRI (Page Request Interface). This queue object + provides user space an FD to poll the page fault events and also to respond + to those events. A FAULT object must be created first to get a fault_id that + could be then used to allocate a fault-enabled HWPT via the IOMMU_HWPT_ALLOC + command by setting the IOMMU_HWPT_FAULT_ID_VALID bit in its flags field. + +- IOMMUFD_OBJ_VIOMMU, representing a slice of the physical IOMMU instance, + passed to or shared with a VM. It may be some HW-accelerated virtualization + features and some SW resources used by the VM. For examples: + + * Security namespace for guest owned ID, e.g. guest-controlled cache tags + * Non-device-affiliated event reporting, e.g. invalidation queue errors + * Access to a shareable nesting parent pagetable across physical IOMMUs + * Virtualization of various platforms IDs, e.g. RIDs and others + * Delivery of paravirtualized invalidation + * Direct assigned invalidation queues + * Direct assigned interrupts + + Such a vIOMMU object generally has the access to a nesting parent pagetable + to support some HW-accelerated virtualization features. So, a vIOMMU object + must be created given a nesting parent HWPT_PAGING object, and then it would + encapsulate that HWPT_PAGING object. Therefore, a vIOMMU object can be used + to allocate an HWPT_NESTED object in place of the encapsulated HWPT_PAGING. + + .. note:: + + The name "vIOMMU" isn't necessarily identical to a virtualized IOMMU in a + VM. A VM can have one giant virtualized IOMMU running on a machine having + multiple physical IOMMUs, in which case the VMM will dispatch the requests + or configurations from this single virtualized IOMMU instance to multiple + vIOMMU objects created for individual slices of different physical IOMMUs. + In other words, a vIOMMU object is always a representation of one physical + IOMMU, not necessarily of a virtualized IOMMU. For VMMs that want the full + virtualization features from physical IOMMUs, it is suggested to build the + same number of virtualized IOMMUs as the number of physical IOMMUs, so the + passed-through devices would be connected to their own virtualized IOMMUs + backed by corresponding vIOMMU objects, in which case a guest OS would do + the "dispatch" naturally instead of VMM trappings. + +- IOMMUFD_OBJ_VDEVICE, representing a virtual device for an IOMMUFD_OBJ_DEVICE + against an IOMMUFD_OBJ_VIOMMU. This virtual device holds the device's virtual + information or attributes (related to the vIOMMU) in a VM. An immediate vDATA + example can be the virtual ID of the device on a vIOMMU, which is a unique ID + that VMM assigns to the device for a translation channel/port of the vIOMMU, + e.g. vSID of ARM SMMUv3, vDeviceID of AMD IOMMU, and vRID of Intel VT-d to a + Context Table. Potential use cases of some advanced security information can + be forwarded via this object too, such as security level or realm information + in a Confidential Compute Architecture. A VMM should create a vDEVICE object + to forward all the device information in a VM, when it connects a device to a + vIOMMU, which is a separate ioctl call from attaching the same device to an + HWPT_PAGING that the vIOMMU holds. + +- IOMMUFD_OBJ_VEVENTQ, representing a software queue for a vIOMMU to report its + events such as translation faults occurred to a nested stage-1 (excluding I/O + page faults that should go through IOMMUFD_OBJ_FAULT) and HW-specific events. + This queue object provides user space an FD to poll/read the vIOMMU events. A + vIOMMU object must be created first to get its viommu_id, which could be then + used to allocate a vEVENTQ. Each vIOMMU can support multiple types of vEVENTS, + but is confined to one vEVENTQ per vEVENTQ type. + +- IOMMUFD_OBJ_HW_QUEUE, representing a hardware accelerated queue, as a subset + of IOMMU's virtualization features, for the IOMMU HW to directly read or write + the virtual queue memory owned by a guest OS. This HW-acceleration feature can + allow VM to work with the IOMMU HW directly without a VM Exit, so as to reduce + overhead from the hypercalls. Along with the HW QUEUE object, iommufd provides + user space an mmap interface for VMM to mmap a physical MMIO region from the + host physical address space to the guest physical address space, allowing the + guest OS to directly control the allocated HW QUEUE. Thus, when allocating a + HW QUEUE, the VMM must request a pair of mmap info (offset/length) and pass in + exactly to an mmap syscall via its offset and length arguments. All user-visible objects are destroyed via the IOMMU_DESTROY uAPI. -The diagram below shows relationship between user-visible objects and kernel +The diagrams below show relationships between user-visible objects and kernel datastructures (external to iommufd), with numbers referred to operations creating the objects and links:: - _________________________________________________________ - | iommufd | - | [1] | - | _________________ | - | | | | - | | | | - | | | | - | | | | - | | | | - | | | | - | | | [3] [2] | - | | | ____________ __________ | - | | IOAS |<--| |<------| | | - | | | |HW_PAGETABLE| | DEVICE | | - | | | |____________| |__________| | - | | | | | | - | | | | | | - | | | | | | - | | | | | | - | | | | | | - | |_________________| | | | - | | | | | - |_________|___________________|___________________|_______| - | | | - | _____v______ _______v_____ - | PFN storage | | | | - |------------>|iommu_domain| |struct device| - |____________| |_____________| + _______________________________________________________________________ + | iommufd (HWPT_PAGING only) | + | | + | [1] [3] [2] | + | ________________ _____________ ________ | + | | | | | | | | + | | IOAS |<---| HWPT_PAGING |<---------------------| DEVICE | | + | |________________| |_____________| |________| | + | | | | | + |_________|____________________|__________________________________|_____| + | | | + | ______v_____ ___v__ + | PFN storage | (paging) | |struct| + |------------>|iommu_domain|<-----------------------|device| + |____________| |______| + + _______________________________________________________________________ + | iommufd (with HWPT_NESTED) | + | | + | [1] [3] [4] [2] | + | ________________ _____________ _____________ ________ | + | | | | | | | | | | + | | IOAS |<---| HWPT_PAGING |<---| HWPT_NESTED |<--| DEVICE | | + | |________________| |_____________| |_____________| |________| | + | | | | | | + |_________|____________________|__________________|_______________|_____| + | | | | + | ______v_____ ______v_____ ___v__ + | PFN storage | (paging) | | (nested) | |struct| + |------------>|iommu_domain|<----|iommu_domain|<----|device| + |____________| |____________| |______| + + _______________________________________________________________________ + | iommufd (with vIOMMU/vDEVICE) | + | | + | [5] [6] | + | _____________ _____________ | + | | | | | | + | |----------------| vIOMMU |<---| vDEVICE |<----| | + | | | | |_____________| | | + | | | | | | + | | [1] | | [4] | [2] | + | | ______ | | _____________ _|______ | + | | | | | [3] | | | | | | + | | | IOAS |<---|(HWPT_PAGING)|<---| HWPT_NESTED |<--| DEVICE | | + | | |______| |_____________| |_____________| |________| | + | | | | | | | + |______|________|______________|__________________|_______________|_____| + | | | | | + ______v_____ | ______v_____ ______v_____ ___v__ + | struct | | PFN | (paging) | | (nested) | |struct| + |iommu_device| |------>|iommu_domain|<----|iommu_domain|<----|device| + |____________| storage|____________| |____________| |______| 1. IOMMUFD_OBJ_IOAS is created via the IOMMU_IOAS_ALLOC uAPI. An iommufd can hold multiple IOAS objects. IOAS is the most generic object and does not @@ -94,21 +207,63 @@ creating the objects and links:: device. The driver must also set the driver_managed_dma flag and must not touch the device until this operation succeeds. -3. IOMMUFD_OBJ_HW_PAGETABLE is created when an external driver calls the IOMMUFD - kAPI to attach a bound device to an IOAS. Similarly the external driver uAPI - allows userspace to initiate the attaching operation. If a compatible - pagetable already exists then it is reused for the attachment. Otherwise a - new pagetable object and iommu_domain is created. Successful completion of - this operation sets up the linkages among IOAS, device and iommu_domain. Once - this completes the device could do DMA. - - Every iommu_domain inside the IOAS is also represented to userspace as a - HW_PAGETABLE object. +3. IOMMUFD_OBJ_HWPT_PAGING can be created in two ways: + + * IOMMUFD_OBJ_HWPT_PAGING is automatically created when an external driver + calls the IOMMUFD kAPI to attach a bound device to an IOAS. Similarly the + external driver uAPI allows userspace to initiate the attaching operation. + If a compatible member HWPT_PAGING object exists in the IOAS's HWPT_PAGING + list, then it will be reused. Otherwise a new HWPT_PAGING that represents + an iommu_domain to userspace will be created, and then added to the list. + Successful completion of this operation sets up the linkages among IOAS, + device and iommu_domain. Once this completes the device could do DMA. + + * IOMMUFD_OBJ_HWPT_PAGING can be manually created via the IOMMU_HWPT_ALLOC + uAPI, provided an ioas_id via @pt_id to associate the new HWPT_PAGING to + the corresponding IOAS object. The benefit of this manual allocation is to + allow allocation flags (defined in enum iommufd_hwpt_alloc_flags), e.g. it + allocates a nesting parent HWPT_PAGING if the IOMMU_HWPT_ALLOC_NEST_PARENT + flag is set. + +4. IOMMUFD_OBJ_HWPT_NESTED can be only manually created via the IOMMU_HWPT_ALLOC + uAPI, provided an hwpt_id or a viommu_id of a vIOMMU object encapsulating a + nesting parent HWPT_PAGING via @pt_id to associate the new HWPT_NESTED object + to the corresponding HWPT_PAGING object. The associating HWPT_PAGING object + must be a nesting parent manually allocated via the same uAPI previously with + an IOMMU_HWPT_ALLOC_NEST_PARENT flag, otherwise the allocation will fail. The + allocation will be further validated by the IOMMU driver to ensure that the + nesting parent domain and the nested domain being allocated are compatible. + Successful completion of this operation sets up linkages among IOAS, device, + and iommu_domains. Once this completes the device could do DMA via a 2-stage + translation, a.k.a nested translation. Note that multiple HWPT_NESTED objects + can be allocated by (and then associated to) the same nesting parent. .. note:: - Future IOMMUFD updates will provide an API to create and manipulate the - HW_PAGETABLE directly. + Either a manual IOMMUFD_OBJ_HWPT_PAGING or an IOMMUFD_OBJ_HWPT_NESTED is + created via the same IOMMU_HWPT_ALLOC uAPI. The difference is at the type + of the object passed in via the @pt_id field of struct iommufd_hwpt_alloc. + +5. IOMMUFD_OBJ_VIOMMU can be only manually created via the IOMMU_VIOMMU_ALLOC + uAPI, provided a dev_id (for the device's physical IOMMU to back the vIOMMU) + and an hwpt_id (to associate the vIOMMU to a nesting parent HWPT_PAGING). The + iommufd core will link the vIOMMU object to the struct iommu_device that the + struct device is behind. And an IOMMU driver can implement a viommu_alloc op + to allocate its own vIOMMU data structure embedding the core-level structure + iommufd_viommu and some driver-specific data. If necessary, the driver can + also configure its HW virtualization feature for that vIOMMU (and thus for + the VM). Successful completion of this operation sets up the linkages between + the vIOMMU object and the HWPT_PAGING, then this vIOMMU object can be used + as a nesting parent object to allocate an HWPT_NESTED object described above. + +6. IOMMUFD_OBJ_VDEVICE can be only manually created via the IOMMU_VDEVICE_ALLOC + uAPI, provided a viommu_id for an iommufd_viommu object and a dev_id for an + iommufd_device object. The vDEVICE object will be the binding between these + two parent objects. Another @virt_id will be also set via the uAPI providing + the iommufd core an index to store the vDEVICE object to a vDEVICE array per + vIOMMU. If necessary, the IOMMU driver may choose to implement a vdevce_alloc + op to init its HW for virtualization feature related to a vDEVICE. Successful + completion of this operation sets up the linkages between vIOMMU and device. A device can only bind to an iommufd due to DMA ownership claim and attach to at most one IOAS object (no support of PASID yet). @@ -120,7 +275,13 @@ User visible objects are backed by following datastructures: - iommufd_ioas for IOMMUFD_OBJ_IOAS. - iommufd_device for IOMMUFD_OBJ_DEVICE. -- iommufd_hw_pagetable for IOMMUFD_OBJ_HW_PAGETABLE. +- iommufd_hwpt_paging for IOMMUFD_OBJ_HWPT_PAGING. +- iommufd_hwpt_nested for IOMMUFD_OBJ_HWPT_NESTED. +- iommufd_fault for IOMMUFD_OBJ_FAULT. +- iommufd_viommu for IOMMUFD_OBJ_VIOMMU. +- iommufd_vdevice for IOMMUFD_OBJ_VDEVICE. +- iommufd_veventq for IOMMUFD_OBJ_VEVENTQ. +- iommufd_hw_queue for IOMMUFD_OBJ_HW_QUEUE. Several terminologies when looking at these datastructures: diff --git a/Documentation/userspace-api/landlock.rst b/Documentation/userspace-api/landlock.rst index d639c61cb472..1d0c2c15c22e 100644 --- a/Documentation/userspace-api/landlock.rst +++ b/Documentation/userspace-api/landlock.rst @@ -8,7 +8,7 @@ Landlock: unprivileged access control ===================================== :Author: Mickaël Salaün -:Date: October 2024 +:Date: March 2025 The goal of Landlock is to enable restriction of ambient rights (e.g. global filesystem or network access) for a set of processes. Because Landlock @@ -317,33 +317,32 @@ IPC scoping ----------- Similar to the implicit `Ptrace restrictions`_, we may want to further restrict -interactions between sandboxes. Each Landlock domain can be explicitly scoped -for a set of actions by specifying it on a ruleset. For example, if a -sandboxed process should not be able to :manpage:`connect(2)` to a -non-sandboxed process through abstract :manpage:`unix(7)` sockets, we can -specify such a restriction with ``LANDLOCK_SCOPE_ABSTRACT_UNIX_SOCKET``. -Moreover, if a sandboxed process should not be able to send a signal to a -non-sandboxed process, we can specify this restriction with -``LANDLOCK_SCOPE_SIGNAL``. - -A sandboxed process can connect to a non-sandboxed process when its domain is -not scoped. If a process's domain is scoped, it can only connect to sockets -created by processes in the same scope. -Moreover, If a process is scoped to send signal to a non-scoped process, it can -only send signals to processes in the same scope. - -A connected datagram socket behaves like a stream socket when its domain is -scoped, meaning if the domain is scoped after the socket is connected , it can -still :manpage:`send(2)` data just like a stream socket. However, in the same -scenario, a non-connected datagram socket cannot send data (with -:manpage:`sendto(2)`) outside its scope. - -A process with a scoped domain can inherit a socket created by a non-scoped -process. The process cannot connect to this socket since it has a scoped -domain. - -IPC scoping does not support exceptions, so if a domain is scoped, no rules can -be added to allow access to resources or processes outside of the scope. +interactions between sandboxes. Therefore, at ruleset creation time, each +Landlock domain can restrict the scope for certain operations, so that these +operations can only reach out to processes within the same Landlock domain or in +a nested Landlock domain (the "scope"). + +The operations which can be scoped are: + +``LANDLOCK_SCOPE_SIGNAL`` + This limits the sending of signals to target processes which run within the + same or a nested Landlock domain. + +``LANDLOCK_SCOPE_ABSTRACT_UNIX_SOCKET`` + This limits the set of abstract :manpage:`unix(7)` sockets to which we can + :manpage:`connect(2)` to socket addresses which were created by a process in + the same or a nested Landlock domain. + + A :manpage:`sendto(2)` on a non-connected datagram socket is treated as if + it were doing an implicit :manpage:`connect(2)` and will be blocked if the + remote end does not stem from the same or a nested Landlock domain. + + A :manpage:`sendto(2)` on a socket which was previously connected will not + be restricted. This works for both datagram and stream sockets. + +IPC scoping does not support exceptions via :manpage:`landlock_add_rule(2)`. +If an operation is scoped within a domain, no rules can be added to allow access +to resources or processes outside of the scope. Truncating files ---------------- @@ -595,6 +594,16 @@ Starting with the Landlock ABI version 6, it is possible to restrict :manpage:`signal(7)` sending by setting ``LANDLOCK_SCOPE_SIGNAL`` to the ``scoped`` ruleset attribute. +Logging (ABI < 7) +----------------- + +Starting with the Landlock ABI version 7, it is possible to control logging of +Landlock audit events with the ``LANDLOCK_RESTRICT_SELF_LOG_SAME_EXEC_OFF``, +``LANDLOCK_RESTRICT_SELF_LOG_NEW_EXEC_ON``, and +``LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF`` flags passed to +sys_landlock_restrict_self(). See Documentation/admin-guide/LSM/landlock.rst +for more details on audit. + .. _kernel_support: Kernel support @@ -683,9 +692,16 @@ fine-grained restrictions). Moreover, their complexity can lead to security issues, especially when untrusted processes can manipulate them (cf. `Controlling access to user namespaces <https://lwn.net/Articles/673597/>`_). +How to disable Landlock audit records? +-------------------------------------- + +You might want to put in place filters as explained here: +Documentation/admin-guide/LSM/landlock.rst + Additional documentation ======================== +* Documentation/admin-guide/LSM/landlock.rst * Documentation/security/landlock.rst * https://landlock.io diff --git a/Documentation/userspace-api/media/cec/cec-pin-error-inj.rst b/Documentation/userspace-api/media/cec/cec-pin-error-inj.rst index 411d42a742f3..c02790319f3f 100644 --- a/Documentation/userspace-api/media/cec/cec-pin-error-inj.rst +++ b/Documentation/userspace-api/media/cec/cec-pin-error-inj.rst @@ -41,6 +41,9 @@ error injection status:: # <op> rx-clear clear all rx error injections for <op> # <op> tx-clear clear all tx error injections for <op> # + # RX error injection settings: + # rx-no-low-drive do not generate low-drive pulses + # # RX error injection: # <op>[,<mode>] rx-nack NACK the message instead of sending an ACK # <op>[,<mode>] rx-low-drive <bit> force a low-drive condition at this bit position @@ -53,6 +56,10 @@ error injection status:: # tx-custom-low-usecs <usecs> define the 'low' time for the custom pulse # tx-custom-high-usecs <usecs> define the 'high' time for the custom pulse # tx-custom-pulse transmit the custom pulse once the bus is idle + # tx-glitch-low-usecs <usecs> define the 'low' time for the glitch pulse + # tx-glitch-high-usecs <usecs> define the 'high' time for the glitch pulse + # tx-glitch-falling-edge send the glitch pulse after every falling edge + # tx-glitch-rising-edge send the glitch pulse after every rising edge # # TX error injection: # <op>[,<mode>] tx-no-eom don't set the EOM bit @@ -193,6 +200,14 @@ Receive Messages This does not work if the remote CEC transmitter has logical address 0 ('TV') since that will always win. +``rx-no-low-drive`` + The receiver will ignore situations that would normally generate a + Low Drive pulse (3.6 ms). This is typically done if a spurious pulse is + detected when receiving a message, and it indicates to the transmitter that + the message has to be retransmitted since the receiver got confused. + Disabling this is useful to test how other CEC devices handle glitches + by ensuring we will not be the one that generates a Low Drive. + Transmit Messages ----------------- @@ -327,3 +342,30 @@ Custom Pulses ``tx-custom-pulse`` Transmit a single custom pulse as soon as the CEC bus is idle. + +Glitch Pulses +------------- + +This emulates what happens if the signal on the CEC line is seeing spurious +pulses. Typically this happens after the falling or rising edge where there +is a short voltage fluctuation that, if the CEC hardware doesn't do +deglitching, can be seen as a spurious pulse and can cause a Low Drive +condition or corrupt data. + +``tx-glitch-low-usecs <usecs>`` + This defines the duration in microseconds that the glitch pulse pulls + the CEC line low. The default is 1 microsecond. The range is 0-100 + microseconds. If 0, then no glitch pulse will be generated. + +``tx-glitch-high-usecs <usecs>`` + This defines the duration in microseconds that the glitch pulse keeps the + CEC line high (unless another CEC adapter pulls it low in that time). + The default is 1 microseconds. The range is 0-100 microseconds. If 0, then + no glitch pulse will be generated.The total period of the glitch pulse is + ``tx-custom-low-usecs + tx-custom-high-usecs``. + +``tx-glitch-falling-edge`` + Send the glitch pulse right after the falling edge. + +``tx-glitch-rising-edge`` + Send the glitch pulse right after the rising edge. diff --git a/Documentation/userspace-api/media/drivers/uvcvideo.rst b/Documentation/userspace-api/media/drivers/uvcvideo.rst index a290f9fadae9..dbb30ad389ae 100644 --- a/Documentation/userspace-api/media/drivers/uvcvideo.rst +++ b/Documentation/userspace-api/media/drivers/uvcvideo.rst @@ -181,6 +181,7 @@ Argument: struct uvc_xu_control_mapping UVC_CTRL_DATA_TYPE_BOOLEAN Boolean UVC_CTRL_DATA_TYPE_ENUM Enumeration UVC_CTRL_DATA_TYPE_BITMASK Bitmask + UVC_CTRL_DATA_TYPE_RECT Rectangular area UVCIOC_CTRL_QUERY - Query a UVC XU control @@ -255,3 +256,66 @@ Argument: struct uvc_xu_control_query __u8 query Request code to send to the device __u16 size Control data size (in bytes) __u8 *data Control value + + +Driver-specific V4L2 controls +----------------------------- + +The uvcvideo driver implements the following UVC-specific controls: + +``V4L2_CID_UVC_REGION_OF_INTEREST_RECT (struct)`` + This control determines the region of interest (ROI). ROI is a + rectangular area represented by a struct :c:type:`v4l2_rect`. The + rectangle is in global sensor coordinates using pixel units. It is + independent of the field of view, not impacted by any cropping or + scaling. + + Use ``V4L2_CTRL_WHICH_MIN_VAL`` and ``V4L2_CTRL_WHICH_MAX_VAL`` to query + the range of rectangle sizes. + + Setting a ROI allows the camera to optimize the capture for the region. + The value of ``V4L2_CID_REGION_OF_INTEREST_AUTO`` control determines + the detailed behavior. + + An example of use of this control, can be found in the: + `Chrome OS USB camera HAL. + <https://chromium.googlesource.com/chromiumos/platform2/+/refs/heads/release-R121-15699.B/camera/hal/usb/>` + + +``V4L2_CID_UVC_REGION_OF_INTEREST_AUTO (bitmask)`` + This determines which, if any, on-board features should track to the + Region of Interest specified by the current value of + ``V4L2_CID_UVD__REGION_OF_INTEREST_RECT``. + + Max value is a mask indicating all supported Auto Controls. + +.. flat-table:: + :header-rows: 0 + :stub-columns: 0 + + * - ``V4L2_UVC_REGION_OF_INTEREST_AUTO_EXPOSURE`` + - Setting this bit causes automatic exposure to track the region of + interest instead of the whole image. + * - ``V4L2_UVC_REGION_OF_INTEREST_AUTO_IRIS`` + - Setting this bit causes automatic iris to track the region of interest + instead of the whole image. + * - ``V4L2_UVC_REGION_OF_INTEREST_AUTO_WHITE_BALANCE`` + - Setting this bit causes automatic white balance to track the region + of interest instead of the whole image. + * - ``V4L2_UVC_REGION_OF_INTEREST_AUTO_FOCUS`` + - Setting this bit causes automatic focus adjustment to track the region + of interest instead of the whole image. + * - ``V4L2_UVC_REGION_OF_INTEREST_AUTO_FACE_DETECT`` + - Setting this bit causes automatic face detection to track the region of + interest instead of the whole image. + * - ``V4L2_UVC_REGION_OF_INTEREST_AUTO_DETECT_AND_TRACK`` + - Setting this bit enables automatic face detection and tracking. The + current value of ``V4L2_CID_REGION_OF_INTEREST_RECT`` may be updated by + the driver. + * - ``V4L2_UVC_REGION_OF_INTEREST_AUTO_IMAGE_STABILIZATION`` + - Setting this bit enables automatic image stabilization. The + current value of ``V4L2_CID_REGION_OF_INTEREST_RECT`` may be updated by + the driver. + * - ``V4L2_UVC_REGION_OF_INTEREST_AUTO_HIGHER_QUALITY`` + - Setting this bit enables automatically capture the specified region + with higher quality if possible. diff --git a/Documentation/userspace-api/media/rc/lirc-set-send-duty-cycle.rst b/Documentation/userspace-api/media/rc/lirc-set-send-duty-cycle.rst index 2979752acbcd..a94750d00898 100644 --- a/Documentation/userspace-api/media/rc/lirc-set-send-duty-cycle.rst +++ b/Documentation/userspace-api/media/rc/lirc-set-send-duty-cycle.rst @@ -27,7 +27,7 @@ Arguments File descriptor returned by open(). ``duty_cycle`` - Duty cicle, describing the pulse width in percent (from 1 to 99) of + Duty cycle, describing the pulse width in percent (from 1 to 99) of the total cycle. Values 0 and 100 are reserved. Description diff --git a/Documentation/userspace-api/media/rc/rc-protos.rst b/Documentation/userspace-api/media/rc/rc-protos.rst index 2a888ff5829f..ec706290c921 100644 --- a/Documentation/userspace-api/media/rc/rc-protos.rst +++ b/Documentation/userspace-api/media/rc/rc-protos.rst @@ -449,6 +449,6 @@ the 32 bits. xbox-dvd (RC_PROTO_XBOX_DVD) ---------------------------- -This protocol is used by XBox DVD Remote, which was made for the original -XBox. There is no in-kernel decoder or encoder for this protocol. The usb +This protocol is used by Xbox DVD Remote, which was made for the original +Xbox. There is no in-kernel decoder or encoder for this protocol. The usb device decodes the protocol. There is a BPF decoder available in v4l-utils. diff --git a/Documentation/userspace-api/media/rc/rc-sysfs-nodes.rst b/Documentation/userspace-api/media/rc/rc-sysfs-nodes.rst index 34d6a0a1f4d3..70b5966aaff8 100644 --- a/Documentation/userspace-api/media/rc/rc-sysfs-nodes.rst +++ b/Documentation/userspace-api/media/rc/rc-sysfs-nodes.rst @@ -6,7 +6,7 @@ Remote Controller's sysfs nodes ******************************* -As defined at ``Documentation/ABI/testing/sysfs-class-rc``, those are +As defined at Documentation/ABI/testing/sysfs-class-rc, those are the sysfs nodes that control the Remote Controllers: diff --git a/Documentation/userspace-api/media/v4l/biblio.rst b/Documentation/userspace-api/media/v4l/biblio.rst index 35674eeae20d..856acf6a890c 100644 --- a/Documentation/userspace-api/media/v4l/biblio.rst +++ b/Documentation/userspace-api/media/v4l/biblio.rst @@ -150,7 +150,7 @@ ITU-T.81 ======== -:title: ITU-T Recommendation T.81 "Information Technology --- Digital Compression and Coding of Continous-Tone Still Images --- Requirements and Guidelines" +:title: ITU-T Recommendation T.81 "Information Technology --- Digital Compression and Coding of Continuous-Tone Still Images --- Requirements and Guidelines" :author: International Telecommunication Union (http://www.itu.int) diff --git a/Documentation/userspace-api/media/v4l/control.rst b/Documentation/userspace-api/media/v4l/control.rst index 57893814a1e5..9253cc946f02 100644 --- a/Documentation/userspace-api/media/v4l/control.rst +++ b/Documentation/userspace-api/media/v4l/control.rst @@ -290,13 +290,15 @@ Control IDs This is a read-only control that can be read by the application and used as a hint to determine the number of CAPTURE buffers to pass to REQBUFS. The value is the minimum number of CAPTURE buffers that is - necessary for hardware to work. + necessary for hardware to work. This control is required for stateful + decoders. ``V4L2_CID_MIN_BUFFERS_FOR_OUTPUT`` ``(integer)`` This is a read-only control that can be read by the application and used as a hint to determine the number of OUTPUT buffers to pass to REQBUFS. The value is the minimum number of OUTPUT buffers that is - necessary for hardware to work. + necessary for hardware to work. This control is required for stateful + encoders. .. _v4l2-alpha-component: diff --git a/Documentation/userspace-api/media/v4l/dev-sliced-vbi.rst b/Documentation/userspace-api/media/v4l/dev-sliced-vbi.rst index 42cdb0a9f786..96e0e85a822c 100644 --- a/Documentation/userspace-api/media/v4l/dev-sliced-vbi.rst +++ b/Documentation/userspace-api/media/v4l/dev-sliced-vbi.rst @@ -48,7 +48,7 @@ capabilities, and they may support :ref:`control` ioctls. The :ref:`video standard <standard>` ioctls provide information vital to program a sliced VBI device, therefore must be supported. -.. _sliced-vbi-format-negotitation: +.. _sliced-vbi-format-negotiation: Sliced VBI Format Negotiation ============================= @@ -377,7 +377,7 @@ Sliced VBI Data in MPEG Streams If a device can produce an MPEG output stream, it may be capable of providing -:ref:`negotiated sliced VBI services <sliced-vbi-format-negotitation>` +:ref:`negotiated sliced VBI services <sliced-vbi-format-negotiation>` as data embedded in the MPEG stream. Users or applications control this sliced VBI data insertion with the :ref:`V4L2_CID_MPEG_STREAM_VBI_FMT <v4l2-mpeg-stream-vbi-fmt>` diff --git a/Documentation/userspace-api/media/v4l/ext-ctrls-fm-rx.rst b/Documentation/userspace-api/media/v4l/ext-ctrls-fm-rx.rst index b6cfc0e823d2..ccd439e9e0e3 100644 --- a/Documentation/userspace-api/media/v4l/ext-ctrls-fm-rx.rst +++ b/Documentation/userspace-api/media/v4l/ext-ctrls-fm-rx.rst @@ -64,17 +64,12 @@ FM_RX Control IDs broadcasts speech. If the transmitter doesn't make this distinction, then it will be set. -``V4L2_CID_TUNE_DEEMPHASIS`` - (enum) - -enum v4l2_deemphasis - +``V4L2_CID_TUNE_DEEMPHASIS (enum)`` Configures the de-emphasis value for reception. A de-emphasis filter is applied to the broadcast to accentuate the high audio frequencies. Depending on the region, a time constant of either 50 - or 75 useconds is used. The enum v4l2_deemphasis defines possible - values for de-emphasis. Here they are: - - + or 75 microseconds is used. The enum v4l2_deemphasis defines possible + values for de-emphasis. They are: .. flat-table:: :header-rows: 0 diff --git a/Documentation/userspace-api/media/v4l/ext-ctrls-fm-tx.rst b/Documentation/userspace-api/media/v4l/ext-ctrls-fm-tx.rst index 04c997c9a4c3..cb40cf4cc3ec 100644 --- a/Documentation/userspace-api/media/v4l/ext-ctrls-fm-tx.rst +++ b/Documentation/userspace-api/media/v4l/ext-ctrls-fm-tx.rst @@ -104,7 +104,7 @@ FM_TX Control IDs ``V4L2_CID_AUDIO_LIMITER_RELEASE_TIME (integer)`` Sets the audio deviation limiter feature release time. Unit is in - useconds. Step and range are driver-specific. + microseconds. Step and range are driver-specific. ``V4L2_CID_AUDIO_LIMITER_DEVIATION (integer)`` Configures audio frequency deviation level in Hz. The range and step @@ -121,16 +121,16 @@ FM_TX Control IDs range and step are driver-specific. ``V4L2_CID_AUDIO_COMPRESSION_THRESHOLD (integer)`` - Sets the threshold level for audio compression freature. It is a dB + Sets the threshold level for audio compression feature. It is a dB value. The range and step are driver-specific. ``V4L2_CID_AUDIO_COMPRESSION_ATTACK_TIME (integer)`` - Sets the attack time for audio compression feature. It is a useconds + Sets the attack time for audio compression feature. It is a microseconds value. The range and step are driver-specific. ``V4L2_CID_AUDIO_COMPRESSION_RELEASE_TIME (integer)`` Sets the release time for audio compression feature. It is a - useconds value. The range and step are driver-specific. + microseconds value. The range and step are driver-specific. ``V4L2_CID_PILOT_TONE_ENABLED (boolean)`` Enables or disables the pilot tone generation feature. @@ -143,17 +143,12 @@ FM_TX Control IDs Configures pilot tone frequency value. Unit is in Hz. The range and step are driver-specific. -``V4L2_CID_TUNE_PREEMPHASIS`` - (enum) - -enum v4l2_preemphasis - +``V4L2_CID_TUNE_PREEMPHASIS (enum)`` Configures the pre-emphasis value for broadcasting. A pre-emphasis filter is applied to the broadcast to accentuate the high audio frequencies. Depending on the region, a time constant of either 50 - or 75 useconds is used. The enum v4l2_preemphasis defines possible - values for pre-emphasis. Here they are: - - + or 75 microseconds is used. The enum v4l2_preemphasis defines possible + values for pre-emphasis. They are: .. flat-table:: :header-rows: 0 @@ -166,8 +161,6 @@ enum v4l2_preemphasis - * - ``V4L2_PREEMPHASIS_75_uS`` - A pre-emphasis of 75 uS is used. - - ``V4L2_CID_TUNE_POWER_LEVEL (integer)`` Sets the output power level for signal transmission. Unit is in dBuV. Range and step are driver-specific. diff --git a/Documentation/userspace-api/media/v4l/meta-formats.rst b/Documentation/userspace-api/media/v4l/meta-formats.rst index c6e56b5888bc..0de80328c36b 100644 --- a/Documentation/userspace-api/media/v4l/meta-formats.rst +++ b/Documentation/userspace-api/media/v4l/meta-formats.rst @@ -12,12 +12,15 @@ These formats are used for the :ref:`metadata` interface only. .. toctree:: :maxdepth: 1 + metafmt-c3-isp metafmt-d4xx metafmt-generic metafmt-intel-ipu3 metafmt-pisp-be + metafmt-pisp-fe metafmt-rkisp1 metafmt-uvc + metafmt-uvc-msxu-1-5 metafmt-vivid metafmt-vsp1-hgo metafmt-vsp1-hgt diff --git a/Documentation/userspace-api/media/v4l/metafmt-c3-isp.rst b/Documentation/userspace-api/media/v4l/metafmt-c3-isp.rst new file mode 100644 index 000000000000..449b45c2ec24 --- /dev/null +++ b/Documentation/userspace-api/media/v4l/metafmt-c3-isp.rst @@ -0,0 +1,86 @@ +.. SPDX-License-Identifier: (GPL-2.0-only OR MIT) + +.. _v4l2-meta-fmt-c3isp-stats: +.. _v4l2-meta-fmt-c3isp-params: + +*********************************************************************** +V4L2_META_FMT_C3ISP_STATS ('C3ST'), V4L2_META_FMT_C3ISP_PARAMS ('C3PM') +*********************************************************************** + +.. c3_isp_stats_info + +3A Statistics +============= + +The C3 ISP can collect different statistics over an input Bayer frame. +Those statistics are obtained from the "c3-isp-stats" metadata capture video nodes, +using the :c:type:`v4l2_meta_format` interface. +They are formatted as described by the :c:type:`c3_isp_stats_info` structure. + +The statistics collected are Auto-white balance, +Auto-exposure and Auto-focus information. + +.. c3_isp_params_cfg + +Configuration Parameters +======================== + +The configuration parameters are passed to the c3-isp-params metadata output video node, +using the :c:type:`v4l2_meta_format` interface. Rather than a single struct containing +sub-structs for each configurable area of the ISP, parameters for the C3-ISP +are defined as distinct structs or "blocks" which may be added to the data +member of :c:type:`c3_isp_params_cfg`. Userspace is responsible for +populating the data member with the blocks that need to be configured by the driver, but +need not populate it with **all** the blocks, or indeed with any at all if there +are no configuration changes to make. Populated blocks **must** be consecutive +in the buffer. To assist both userspace and the driver in identifying the +blocks each block-specific struct embeds +:c:type:`c3_isp_params_block_header` as its first member and userspace +must populate the type member with a value from +:c:type:`c3_isp_params_block_type`. Once the blocks have been populated +into the data buffer, the combined size of all populated blocks shall be set in +the data_size member of :c:type:`c3_isp_params_cfg`. For example: + +.. code-block:: c + + struct c3_isp_params_cfg *params = + (struct c3_isp_params_cfg *)buffer; + + params->version = C3_ISP_PARAM_BUFFER_V0; + params->data_size = 0; + + void *data = (void *)params->data; + + struct c3_isp_params_awb_gains *gains = + (struct c3_isp_params_awb_gains *)data; + + gains->header.type = C3_ISP_PARAMS_BLOCK_AWB_GAINS; + gains->header.flags = C3_ISP_PARAMS_BLOCK_FL_ENABLE; + gains->header.size = sizeof(struct c3_isp_params_awb_gains); + + gains->gr_gain = 256; + gains->r_gain = 256; + gains->b_gain = 256; + gains->gb_gain = 256; + + data += sizeof(struct c3_isp__params_awb_gains); + params->data_size += sizeof(struct c3_isp_params_awb_gains); + + struct c3_isp_params_awb_config *awb_cfg = + (struct c3_isp_params_awb_config *)data; + + awb_cfg->header.type = C3_ISP_PARAMS_BLOCK_AWB_CONFIG; + awb_cfg->header.flags = C3_ISP_PARAMS_BLOCK_FL_ENABLE; + awb_cfg->header.size = sizeof(struct c3_isp_params_awb_config); + + awb_cfg->tap_point = C3_ISP_AWB_STATS_TAP_BEFORE_WB; + awb_cfg->satur = 1; + awb_cfg->horiz_zones_num = 32; + awb_cfg->vert_zones_num = 24; + + params->data_size += sizeof(struct c3_isp_params_awb_config); + +Amlogic C3 ISP uAPI data types +=============================== + +.. kernel-doc:: include/uapi/linux/media/amlogic/c3-isp-config.h diff --git a/Documentation/userspace-api/media/v4l/metafmt-pisp-fe.rst b/Documentation/userspace-api/media/v4l/metafmt-pisp-fe.rst new file mode 100644 index 000000000000..fddeada83e4a --- /dev/null +++ b/Documentation/userspace-api/media/v4l/metafmt-pisp-fe.rst @@ -0,0 +1,39 @@ +.. SPDX-License-Identifier: GPL-2.0 + +.. _v4l2-meta-fmt-rpi-fe-cfg: + +************************ +V4L2_META_FMT_RPI_FE_CFG +************************ + +Raspberry Pi PiSP Front End configuration format +================================================ + +The Raspberry Pi PiSP Front End image signal processor is configured by +userspace by providing a buffer of configuration parameters to the +`rp1-cfe-fe-config` output video device node using the +:c:type:`v4l2_meta_format` interface. + +The `Raspberry Pi PiSP technical specification +<https://datasheets.raspberrypi.com/camera/raspberry-pi-image-signal-processor-specification.pdf>`_ +provide detailed description of the Front End configuration and programming +model. + +.. _v4l2-meta-fmt-rpi-fe-stats: + +************************** +V4L2_META_FMT_RPI_FE_STATS +************************** + +Raspberry Pi PiSP Front End statistics format +============================================= + +The Raspberry Pi PiSP Front End image signal processor provides statistics data +by writing to a buffer provided via the `rp1-cfe-fe-stats` capture video device +node using the +:c:type:`v4l2_meta_format` interface. + +The `Raspberry Pi PiSP technical specification +<https://datasheets.raspberrypi.com/camera/raspberry-pi-image-signal-processor-specification.pdf>`_ +provide detailed description of the Front End configuration and programming +model. diff --git a/Documentation/userspace-api/media/v4l/metafmt-uvc-msxu-1-5.rst b/Documentation/userspace-api/media/v4l/metafmt-uvc-msxu-1-5.rst new file mode 100644 index 000000000000..dd1c3076df24 --- /dev/null +++ b/Documentation/userspace-api/media/v4l/metafmt-uvc-msxu-1-5.rst @@ -0,0 +1,23 @@ +.. SPDX-License-Identifier: GFDL-1.1-no-invariants-or-later + +.. _v4l2-meta-fmt-uvc-msxu-1-5: + +*********************************** +V4L2_META_FMT_UVC_MSXU_1_5 ('UVCM') +*********************************** + +Microsoft(R)'s UVC Payload Metadata. + + +Description +=========== + +V4L2_META_FMT_UVC_MSXU_1_5 buffers follow the metadata buffer layout of +V4L2_META_FMT_UVC with the only difference that it includes all the UVC +metadata in the `buffer[]` field, not just the first 2-12 bytes. + +The metadata format follows the specification from Microsoft(R) [1]. + +.. _1: + +[1] https://docs.microsoft.com/en-us/windows-hardware/drivers/stream/uvc-extensions-1-5 diff --git a/Documentation/userspace-api/media/v4l/metafmt-uvc.rst b/Documentation/userspace-api/media/v4l/metafmt-uvc.rst index 784346d14bbd..4c05e9e54683 100644 --- a/Documentation/userspace-api/media/v4l/metafmt-uvc.rst +++ b/Documentation/userspace-api/media/v4l/metafmt-uvc.rst @@ -44,7 +44,9 @@ Each individual block contains the following fields: them * - :cspan:`1` *The rest is an exact copy of the UVC payload header:* * - __u8 length; - - length of the rest of the block, including this field + - length of the rest of the block, including this field. Please note that + regardless of this value, for V4L2_META_FMT_UVC the kernel will never + copy more than 2-12 bytes. * - __u8 flags; - Flags, indicating presence of other standard UVC fields * - __u8 buf[]; diff --git a/Documentation/userspace-api/media/v4l/pixfmt-bayer.rst b/Documentation/userspace-api/media/v4l/pixfmt-bayer.rst index ed3eb432967d..b5ca501842b0 100644 --- a/Documentation/userspace-api/media/v4l/pixfmt-bayer.rst +++ b/Documentation/userspace-api/media/v4l/pixfmt-bayer.rst @@ -19,6 +19,7 @@ orders. See also `the Wikipedia article on Bayer filter .. toctree:: :maxdepth: 1 + pixfmt-rawnn-cru pixfmt-srggb8 pixfmt-srggb8-pisp-comp pixfmt-srggb10 diff --git a/Documentation/userspace-api/media/v4l/pixfmt-rawnn-cru.rst b/Documentation/userspace-api/media/v4l/pixfmt-rawnn-cru.rst new file mode 100644 index 000000000000..db81f1cfe0f5 --- /dev/null +++ b/Documentation/userspace-api/media/v4l/pixfmt-rawnn-cru.rst @@ -0,0 +1,143 @@ +.. SPDX-License-Identifier: GFDL-1.1-no-invariants-or-later + +.. _v4l2-pix-fmt-raw-cru10: +.. _v4l2-pix-fmt-raw-cru12: +.. _v4l2-pix-fmt-raw-cru14: +.. _v4l2-pix-fmt-raw-cru20: + +********************************************************************************************************************************** +V4L2_PIX_FMT_RAW_CRU10 ('CR10'), V4L2_PIX_FMT_RAW_CRU12 ('CR12'), V4L2_PIX_FMT_RAW_CRU14 ('CR14'), V4L2_PIX_FMT_RAW_CRU20 ('CR20') +********************************************************************************************************************************** + +=============================================================== +Renesas RZ/V2H Camera Receiver Unit 64-bit packed pixel formats +=============================================================== + +| V4L2_PIX_FMT_RAW_CRU10 (CR10) +| V4L2_PIX_FMT_RAW_CRU12 (CR12) +| V4L2_PIX_FMT_RAW_CRU14 (CR14) +| V4L2_PIX_FMT_RAW_CRU20 (CR20) + +Description +=========== + +These pixel formats are some of the RAW outputs for the Camera Receiver Unit in +the Renesas RZ/V2H SoC. They are raw formats which pack pixels contiguously into +64-bit units, with the 4 or 8 most significant bits padded. + +**Byte Order** + +.. flat-table:: RAW formats + :header-rows: 2 + :stub-columns: 0 + :widths: 36 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 + :fill-cells: + + * - :rspan:`1` Pixel Format Code + - :cspan:`63` Data organization + * - 63 + - 62 + - 61 + - 60 + - 59 + - 58 + - 57 + - 56 + - 55 + - 54 + - 53 + - 52 + - 51 + - 50 + - 49 + - 48 + - 47 + - 46 + - 45 + - 44 + - 43 + - 42 + - 41 + - 40 + - 39 + - 38 + - 37 + - 36 + - 35 + - 34 + - 33 + - 32 + - 31 + - 30 + - 29 + - 28 + - 27 + - 26 + - 25 + - 24 + - 23 + - 22 + - 21 + - 20 + - 19 + - 18 + - 17 + - 16 + - 15 + - 14 + - 13 + - 12 + - 11 + - 10 + - 9 + - 8 + - 7 + - 6 + - 5 + - 4 + - 3 + - 2 + - 1 + - 0 + * - V4L2_PIX_FMT_RAW_CRU10 + - 0 + - 0 + - 0 + - 0 + - :cspan:`9` P5 + - :cspan:`9` P4 + - :cspan:`9` P3 + - :cspan:`9` P2 + - :cspan:`9` P1 + - :cspan:`9` P0 + * - V4L2_PIX_FMT_RAW_CRU12 + - 0 + - 0 + - 0 + - 0 + - :cspan:`11` P4 + - :cspan:`11` P3 + - :cspan:`11` P2 + - :cspan:`11` P1 + - :cspan:`11` P0 + * - V4L2_PIX_FMT_RAW_CRU14 + - 0 + - 0 + - 0 + - 0 + - 0 + - 0 + - 0 + - 0 + - :cspan:`13` P3 + - :cspan:`13` P2 + - :cspan:`13` P1 + - :cspan:`13` P0 + * - V4L2_PIX_FMT_RAW_CRU20 + - 0 + - 0 + - 0 + - 0 + - :cspan:`19` P2 + - :cspan:`19` P1 + - :cspan:`19` P0 diff --git a/Documentation/userspace-api/media/v4l/pixfmt-srggb12p.rst b/Documentation/userspace-api/media/v4l/pixfmt-srggb12p.rst index 7c3810ff783c..8c03aedcc00e 100644 --- a/Documentation/userspace-api/media/v4l/pixfmt-srggb12p.rst +++ b/Documentation/userspace-api/media/v4l/pixfmt-srggb12p.rst @@ -6,7 +6,7 @@ .. _v4l2-pix-fmt-sgrbg12p: ******************************************************************************************************************************* -V4L2_PIX_FMT_SRGGB12P ('pRCC'), V4L2_PIX_FMT_SGRBG12P ('pgCC'), V4L2_PIX_FMT_SGBRG12P ('pGCC'), V4L2_PIX_FMT_SBGGR12P ('pBCC'), +V4L2_PIX_FMT_SRGGB12P ('pRCC'), V4L2_PIX_FMT_SGRBG12P ('pgCC'), V4L2_PIX_FMT_SGBRG12P ('pGCC'), V4L2_PIX_FMT_SBGGR12P ('pBCC') ******************************************************************************************************************************* @@ -20,7 +20,7 @@ Description These four pixel formats are packed raw sRGB / Bayer formats with 12 bits per colour. Every two consecutive samples are packed into three bytes. Each of the first two bytes contain the 8 high order bits of -the pixels, and the third byte contains the four least significants +the pixels, and the third byte contains the four least significant bits of each pixel, in the same order. Each n-pixel row contains n/2 green samples and n/2 blue or red diff --git a/Documentation/userspace-api/media/v4l/pixfmt-srggb14p.rst b/Documentation/userspace-api/media/v4l/pixfmt-srggb14p.rst index 3572e42adb22..f4f53d7dbdeb 100644 --- a/Documentation/userspace-api/media/v4l/pixfmt-srggb14p.rst +++ b/Documentation/userspace-api/media/v4l/pixfmt-srggb14p.rst @@ -24,7 +24,7 @@ These four pixel formats are packed raw sRGB / Bayer formats with 14 bits per colour. Every four consecutive samples are packed into seven bytes. Each of the first four bytes contain the eight high order bits of the pixels, and the three following bytes contains the six least -significants bits of each pixel, in the same order. +significant bits of each pixel, in the same order. Each n-pixel row contains n/2 green samples and n/2 blue or red samples, with alternating green-red and green-blue rows. They are conventionally diff --git a/Documentation/userspace-api/media/v4l/pixfmt-y16i.rst b/Documentation/userspace-api/media/v4l/pixfmt-y16i.rst new file mode 100644 index 000000000000..74ba9e910a38 --- /dev/null +++ b/Documentation/userspace-api/media/v4l/pixfmt-y16i.rst @@ -0,0 +1,73 @@ +.. SPDX-License-Identifier: GFDL-1.1-no-invariants-or-later + +.. _V4L2-PIX-FMT-Y16I: + +************************** +V4L2_PIX_FMT_Y16I ('Y16I') +************************** + +Interleaved grey-scale image, e.g. from a stereo-pair + + +Description +=========== + +This is a grey-scale image with a depth of 16 bits per pixel, but with pixels +from 2 sources interleaved and unpacked. Each pixel is stored in a 16-bit word +in the little-endian order. The first pixel is from the left source. + +**Pixel unpacked representation.** +Left/Right pixels 16-bit unpacked - 16-bit for each interleaved pixel. + +.. flat-table:: + :header-rows: 0 + :stub-columns: 0 + + * - Y'\ :sub:`0L[7:0]` + - Y'\ :sub:`0L[15:8]` + - Y'\ :sub:`0R[7:0]` + - Y'\ :sub:`0R[15:8]` + +**Byte Order.** +Each cell is one byte. + +.. flat-table:: + :header-rows: 0 + :stub-columns: 0 + + * - start + 0: + - Y'\ :sub:`00Llow` + - Y'\ :sub:`00Lhigh` + - Y'\ :sub:`00Rlow` + - Y'\ :sub:`00Rhigh` + - Y'\ :sub:`01Llow` + - Y'\ :sub:`01Lhigh` + - Y'\ :sub:`01Rlow` + - Y'\ :sub:`01Rhigh` + * - start + 8: + - Y'\ :sub:`10Llow` + - Y'\ :sub:`10Lhigh` + - Y'\ :sub:`10Rlow` + - Y'\ :sub:`10Rhigh` + - Y'\ :sub:`11Llow` + - Y'\ :sub:`11Lhigh` + - Y'\ :sub:`11Rlow` + - Y'\ :sub:`11Rhigh` + * - start + 16: + - Y'\ :sub:`20Llow` + - Y'\ :sub:`20Lhigh` + - Y'\ :sub:`20Rlow` + - Y'\ :sub:`20Rhigh` + - Y'\ :sub:`21Llow` + - Y'\ :sub:`21Lhigh` + - Y'\ :sub:`21Rlow` + - Y'\ :sub:`21Rhigh` + * - start + 24: + - Y'\ :sub:`30Llow` + - Y'\ :sub:`30Lhigh` + - Y'\ :sub:`30Rlow` + - Y'\ :sub:`30Rhigh` + - Y'\ :sub:`31Llow` + - Y'\ :sub:`31Lhigh` + - Y'\ :sub:`31Rlow` + - Y'\ :sub:`31Rhigh` diff --git a/Documentation/userspace-api/media/v4l/pixfmt-yuv-planar.rst b/Documentation/userspace-api/media/v4l/pixfmt-yuv-planar.rst index b788f6933855..6e4f399f1f88 100644 --- a/Documentation/userspace-api/media/v4l/pixfmt-yuv-planar.rst +++ b/Documentation/userspace-api/media/v4l/pixfmt-yuv-planar.rst @@ -137,6 +137,13 @@ All components are stored with the same number of bits per component. - Cb, Cr - No - Linear + * - V4L2_PIX_FMT_NV15 + - 'NV15' + - 10 + - 4:2:0 + - Cb, Cr + - Yes + - Linear * - V4L2_PIX_FMT_NV15_4L4 - 'VT15' - 15 @@ -186,6 +193,13 @@ All components are stored with the same number of bits per component. - Cr, Cb - No - Linear + * - V4L2_PIX_FMT_NV20 + - 'NV20' + - 10 + - 4:2:2 + - Cb, Cr + - Yes + - Linear * - V4L2_PIX_FMT_NV24 - 'NV24' - 8 @@ -302,6 +316,57 @@ of the luma plane. - Cr\ :sub:`11` +.. _V4L2-PIX-FMT-NV15: + +NV15 +---- + +Semi-planar 10-bit YUV 4:2:0 format similar to NV12, using 10-bit components +with no padding between each component. A group of 4 components are stored over +5 bytes in little endian order. + +.. flat-table:: Sample 4x4 NV15 Image (1 byte per cell) + :header-rows: 0 + :stub-columns: 0 + + * - start + 0: + - Y'\ :sub:`00[7:0]` + - Y'\ :sub:`01[5:0]`\ Y'\ :sub:`00[9:8]` + - Y'\ :sub:`02[3:0]`\ Y'\ :sub:`01[9:6]` + - Y'\ :sub:`03[1:0]`\ Y'\ :sub:`02[9:4]` + - Y'\ :sub:`03[9:2]` + * - start + 5: + - Y'\ :sub:`10[7:0]` + - Y'\ :sub:`11[5:0]`\ Y'\ :sub:`10[9:8]` + - Y'\ :sub:`12[3:0]`\ Y'\ :sub:`11[9:6]` + - Y'\ :sub:`13[1:0]`\ Y'\ :sub:`12[9:4]` + - Y'\ :sub:`13[9:2]` + * - start + 10: + - Y'\ :sub:`20[7:0]` + - Y'\ :sub:`21[5:0]`\ Y'\ :sub:`20[9:8]` + - Y'\ :sub:`22[3:0]`\ Y'\ :sub:`21[9:6]` + - Y'\ :sub:`23[1:0]`\ Y'\ :sub:`22[9:4]` + - Y'\ :sub:`23[9:2]` + * - start + 15: + - Y'\ :sub:`30[7:0]` + - Y'\ :sub:`31[5:0]`\ Y'\ :sub:`30[9:8]` + - Y'\ :sub:`32[3:0]`\ Y'\ :sub:`31[9:6]` + - Y'\ :sub:`33[1:0]`\ Y'\ :sub:`32[9:4]` + - Y'\ :sub:`33[9:2]` + * - start + 20: + - Cb\ :sub:`00[7:0]` + - Cr\ :sub:`00[5:0]`\ Cb\ :sub:`00[9:8]` + - Cb\ :sub:`01[3:0]`\ Cr\ :sub:`00[9:6]` + - Cr\ :sub:`01[1:0]`\ Cb\ :sub:`01[9:4]` + - Cr\ :sub:`01[9:2]` + * - start + 25: + - Cb\ :sub:`10[7:0]` + - Cr\ :sub:`10[5:0]`\ Cb\ :sub:`10[9:8]` + - Cb\ :sub:`11[3:0]`\ Cr\ :sub:`10[9:6]` + - Cr\ :sub:`11[1:0]`\ Cb\ :sub:`11[9:4]` + - Cr\ :sub:`11[9:2]` + + .. _V4L2-PIX-FMT-NV12MT: .. _V4L2-PIX-FMT-NV12MT-16X16: .. _V4L2-PIX-FMT-NV12-4L4: @@ -631,6 +696,69 @@ number of lines as the luma plane. - Cr\ :sub:`32` +.. _V4L2-PIX-FMT-NV20: + +NV20 +---- + +Semi-planar 10-bit YUV 4:2:2 format similar to NV16, using 10-bit components +with no padding between each component. A group of 4 components are stored over +5 bytes in little endian order. + +.. flat-table:: Sample 4x4 NV20 Image (1 byte per cell) + :header-rows: 0 + :stub-columns: 0 + + * - start + 0: + - Y'\ :sub:`00[7:0]` + - Y'\ :sub:`01[5:0]`\ Y'\ :sub:`00[9:8]` + - Y'\ :sub:`02[3:0]`\ Y'\ :sub:`01[9:6]` + - Y'\ :sub:`03[1:0]`\ Y'\ :sub:`02[9:4]` + - Y'\ :sub:`03[9:2]` + * - start + 5: + - Y'\ :sub:`10[7:0]` + - Y'\ :sub:`11[5:0]`\ Y'\ :sub:`10[9:8]` + - Y'\ :sub:`12[3:0]`\ Y'\ :sub:`11[9:6]` + - Y'\ :sub:`13[1:0]`\ Y'\ :sub:`12[9:4]` + - Y'\ :sub:`13[9:2]` + * - start + 10: + - Y'\ :sub:`20[7:0]` + - Y'\ :sub:`21[5:0]`\ Y'\ :sub:`20[9:8]` + - Y'\ :sub:`22[3:0]`\ Y'\ :sub:`21[9:6]` + - Y'\ :sub:`23[1:0]`\ Y'\ :sub:`22[9:4]` + - Y'\ :sub:`23[9:2]` + * - start + 15: + - Y'\ :sub:`30[7:0]` + - Y'\ :sub:`31[5:0]`\ Y'\ :sub:`30[9:8]` + - Y'\ :sub:`32[3:0]`\ Y'\ :sub:`31[9:6]` + - Y'\ :sub:`33[1:0]`\ Y'\ :sub:`32[9:4]` + - Y'\ :sub:`33[9:2]` + * - start + 20: + - Cb\ :sub:`00[7:0]` + - Cr\ :sub:`00[5:0]`\ Cb\ :sub:`00[9:8]` + - Cb\ :sub:`01[3:0]`\ Cr\ :sub:`00[9:6]` + - Cr\ :sub:`01[1:0]`\ Cb\ :sub:`01[9:4]` + - Cr\ :sub:`01[9:2]` + * - start + 25: + - Cb\ :sub:`10[7:0]` + - Cr\ :sub:`10[5:0]`\ Cb\ :sub:`10[9:8]` + - Cb\ :sub:`11[3:0]`\ Cr\ :sub:`10[9:6]` + - Cr\ :sub:`11[1:0]`\ Cb\ :sub:`11[9:4]` + - Cr\ :sub:`11[9:2]` + * - start + 30: + - Cb\ :sub:`20[7:0]` + - Cr\ :sub:`20[5:0]`\ Cb\ :sub:`20[9:8]` + - Cb\ :sub:`21[3:0]`\ Cr\ :sub:`20[9:6]` + - Cr\ :sub:`21[1:0]`\ Cb\ :sub:`21[9:4]` + - Cr\ :sub:`21[9:2]` + * - start + 35: + - Cb\ :sub:`30[7:0]` + - Cr\ :sub:`30[5:0]`\ Cb\ :sub:`30[9:8]` + - Cb\ :sub:`31[3:0]`\ Cr\ :sub:`30[9:6]` + - Cr\ :sub:`31[1:0]`\ Cb\ :sub:`31[9:4]` + - Cr\ :sub:`31[9:2]` + + .. _V4L2-PIX-FMT-NV24: .. _V4L2-PIX-FMT-NV42: diff --git a/Documentation/userspace-api/media/v4l/subdev-formats.rst b/Documentation/userspace-api/media/v4l/subdev-formats.rst index d2a6cd2e1eb2..2a94371448dc 100644 --- a/Documentation/userspace-api/media/v4l/subdev-formats.rst +++ b/Documentation/userspace-api/media/v4l/subdev-formats.rst @@ -2225,7 +2225,7 @@ The following table list existing packed 48bit wide RGB formats. \endgroup On LVDS buses, usually each sample is transferred serialized in seven -time slots per pixel clock, on three (18-bit) or four (24-bit) +time slots per pixel clock, on three (18-bit) or four (24-bit) or five (30-bit) differential data pairs at the same time. The remaining bits are used for control signals as defined by SPWG/PSWG/VESA or JEIDA standards. The 24-bit RGB format serialized in seven time slots on four lanes using @@ -2246,11 +2246,12 @@ JEIDA defined bit mapping will be named - Code - - - - :cspan:`3` Data organization + - :cspan:`4` Data organization * - - - Timeslot - Lane + - 4 - 3 - 2 - 1 @@ -2262,6 +2263,7 @@ JEIDA defined bit mapping will be named - 0 - - + - - d - b\ :sub:`1` - g\ :sub:`0` @@ -2270,6 +2272,7 @@ JEIDA defined bit mapping will be named - 1 - - + - - d - b\ :sub:`0` - r\ :sub:`5` @@ -2278,6 +2281,7 @@ JEIDA defined bit mapping will be named - 2 - - + - - d - g\ :sub:`5` - r\ :sub:`4` @@ -2286,6 +2290,7 @@ JEIDA defined bit mapping will be named - 3 - - + - - b\ :sub:`5` - g\ :sub:`4` - r\ :sub:`3` @@ -2294,6 +2299,7 @@ JEIDA defined bit mapping will be named - 4 - - + - - b\ :sub:`4` - g\ :sub:`3` - r\ :sub:`2` @@ -2302,6 +2308,7 @@ JEIDA defined bit mapping will be named - 5 - - + - - b\ :sub:`3` - g\ :sub:`2` - r\ :sub:`1` @@ -2310,6 +2317,7 @@ JEIDA defined bit mapping will be named - 6 - - + - - b\ :sub:`2` - g\ :sub:`1` - r\ :sub:`0` @@ -2319,6 +2327,7 @@ JEIDA defined bit mapping will be named - 0x1011 - 0 - + - - d - d - b\ :sub:`1` @@ -2327,6 +2336,7 @@ JEIDA defined bit mapping will be named - - 1 - + - - b\ :sub:`7` - d - b\ :sub:`0` @@ -2335,6 +2345,7 @@ JEIDA defined bit mapping will be named - - 2 - + - - b\ :sub:`6` - d - g\ :sub:`5` @@ -2343,6 +2354,7 @@ JEIDA defined bit mapping will be named - - 3 - + - - g\ :sub:`7` - b\ :sub:`5` - g\ :sub:`4` @@ -2351,6 +2363,7 @@ JEIDA defined bit mapping will be named - - 4 - + - - g\ :sub:`6` - b\ :sub:`4` - g\ :sub:`3` @@ -2359,6 +2372,7 @@ JEIDA defined bit mapping will be named - - 5 - + - - r\ :sub:`7` - b\ :sub:`3` - g\ :sub:`2` @@ -2367,6 +2381,7 @@ JEIDA defined bit mapping will be named - - 6 - + - - r\ :sub:`6` - b\ :sub:`2` - g\ :sub:`1` @@ -2377,6 +2392,7 @@ JEIDA defined bit mapping will be named - 0x1012 - 0 - + - - d - d - b\ :sub:`3` @@ -2385,6 +2401,7 @@ JEIDA defined bit mapping will be named - - 1 - + - - b\ :sub:`1` - d - b\ :sub:`2` @@ -2393,6 +2410,7 @@ JEIDA defined bit mapping will be named - - 2 - + - - b\ :sub:`0` - d - g\ :sub:`7` @@ -2401,6 +2419,7 @@ JEIDA defined bit mapping will be named - - 3 - + - - g\ :sub:`1` - b\ :sub:`7` - g\ :sub:`6` @@ -2409,6 +2428,7 @@ JEIDA defined bit mapping will be named - - 4 - + - - g\ :sub:`0` - b\ :sub:`6` - g\ :sub:`5` @@ -2417,6 +2437,7 @@ JEIDA defined bit mapping will be named - - 5 - + - - r\ :sub:`1` - b\ :sub:`5` - g\ :sub:`4` @@ -2425,10 +2446,141 @@ JEIDA defined bit mapping will be named - - 6 - + - + - r\ :sub:`0` + - b\ :sub:`4` + - g\ :sub:`3` + - r\ :sub:`2` + * .. _MEDIA-BUS-FMT-RGB101010-1X7X5-SPWG: + + - MEDIA_BUS_FMT_RGB101010_1X7X5_SPWG + - 0x1026 + - 0 + - + - d + - d + - d + - b\ :sub:`1` + - g\ :sub:`0` + * - + - + - 1 + - + - b\ :sub:`9` + - b\ :sub:`7` + - d + - b\ :sub:`0` + - r\ :sub:`5` + * - + - + - 2 + - + - b\ :sub:`8` + - b\ :sub:`6` + - d + - g\ :sub:`5` + - r\ :sub:`4` + * - + - + - 3 + - + - g\ :sub:`9` + - g\ :sub:`7` + - b\ :sub:`5` + - g\ :sub:`4` + - r\ :sub:`3` + * - + - + - 4 + - + - g\ :sub:`8` + - g\ :sub:`6` + - b\ :sub:`4` + - g\ :sub:`3` + - r\ :sub:`2` + * - + - + - 5 + - + - r\ :sub:`9` + - r\ :sub:`7` + - b\ :sub:`3` + - g\ :sub:`2` + - r\ :sub:`1` + * - + - + - 6 + - + - r\ :sub:`8` + - r\ :sub:`6` + - b\ :sub:`2` + - g\ :sub:`1` - r\ :sub:`0` + * .. _MEDIA-BUS-FMT-RGB101010-1X7X5-JEIDA: + + - MEDIA_BUS_FMT_RGB101010_1X7X5_JEIDA + - 0x1027 + - 0 + - + - d + - d + - d + - b\ :sub:`5` + - g\ :sub:`4` + * - + - + - 1 + - + - b\ :sub:`1` + - b\ :sub:`3` + - d - b\ :sub:`4` + - r\ :sub:`9` + * - + - + - 2 + - + - b\ :sub:`0` + - b\ :sub:`2` + - d + - g\ :sub:`9` + - r\ :sub:`8` + * - + - + - 3 + - + - g\ :sub:`1` - g\ :sub:`3` + - b\ :sub:`9` + - g\ :sub:`8` + - r\ :sub:`7` + * - + - + - 4 + - + - g\ :sub:`0` + - g\ :sub:`2` + - b\ :sub:`8` + - g\ :sub:`7` + - r\ :sub:`6` + * - + - + - 5 + - + - r\ :sub:`1` + - r\ :sub:`3` + - b\ :sub:`7` + - g\ :sub:`6` + - r\ :sub:`5` + * - + - + - 6 + - + - r\ :sub:`0` - r\ :sub:`2` + - b\ :sub:`6` + - g\ :sub:`5` + - r\ :sub:`4` .. raw:: latex diff --git a/Documentation/userspace-api/media/v4l/vidioc-enum-fmt.rst b/Documentation/userspace-api/media/v4l/vidioc-enum-fmt.rst index 3adb3d205531..0f69aa04607f 100644 --- a/Documentation/userspace-api/media/v4l/vidioc-enum-fmt.rst +++ b/Documentation/userspace-api/media/v4l/vidioc-enum-fmt.rst @@ -85,7 +85,17 @@ the ``mbus_code`` field is handled differently: * - __u32 - ``index`` - Number of the format in the enumeration, set by the application. - This is in no way related to the ``pixelformat`` field. + This is in no way related to the ``pixelformat`` field. + When the index is ORed with ``V4L2_FMTDESC_FLAG_ENUM_ALL`` the + driver clears the flag and enumerates all the possible formats, + ignoring any limitations from the current configuration. Drivers + which do not support this flag always return an ``EINVAL`` + error code without clearing this flag. + Formats enumerated when using ``V4L2_FMTDESC_FLAG_ENUM_ALL`` flag + shouldn't be used when calling :c:func:`VIDIOC_ENUM_FRAMESIZES` + or :c:func:`VIDIOC_ENUM_FRAMEINTERVALS`. + ``V4L2_FMTDESC_FLAG_ENUM_ALL`` should only be used by drivers that + can return different format list depending on this flag. * - __u32 - ``type`` - Type of the data stream, set by the application. Only these types @@ -234,6 +244,12 @@ the ``mbus_code`` field is handled differently: valid. The buffer consists of ``height`` lines, each having ``width`` Data Units of data and the offset (in bytes) between the beginning of each two consecutive lines is ``bytesperline``. + * - ``V4L2_FMTDESC_FLAG_ENUM_ALL`` + - 0x80000000 + - When the applications ORs ``index`` with ``V4L2_FMTDESC_FLAG_ENUM_ALL`` flag + the driver enumerates all the possible pixel formats without taking care + of any already set configuration. Drivers which do not support this flag, + always return ``EINVAL`` without clearing this flag. Return Value ============ diff --git a/Documentation/userspace-api/media/v4l/vidioc-g-ext-ctrls.rst b/Documentation/userspace-api/media/v4l/vidioc-g-ext-ctrls.rst index 4d56c0528ad7..b8698b85bd80 100644 --- a/Documentation/userspace-api/media/v4l/vidioc-g-ext-ctrls.rst +++ b/Documentation/userspace-api/media/v4l/vidioc-g-ext-ctrls.rst @@ -199,6 +199,10 @@ still cause this situation. - ``p_area`` - A pointer to a struct :c:type:`v4l2_area`. Valid if this control is of type ``V4L2_CTRL_TYPE_AREA``. + * - struct :c:type:`v4l2_rect` * + - ``p_rect`` + - A pointer to a struct :c:type:`v4l2_rect`. Valid if this control is + of type ``V4L2_CTRL_TYPE_RECT``. * - struct :c:type:`v4l2_ctrl_h264_sps` * - ``p_h264_sps`` - A pointer to a struct :c:type:`v4l2_ctrl_h264_sps`. Valid if this control is @@ -334,14 +338,26 @@ still cause this situation. - Which value of the control to get/set/try. * - :cspan:`2` ``V4L2_CTRL_WHICH_CUR_VAL`` will return the current value of the control, ``V4L2_CTRL_WHICH_DEF_VAL`` will return the default - value of the control and ``V4L2_CTRL_WHICH_REQUEST_VAL`` indicates that - these controls have to be retrieved from a request or tried/set for - a request. In the latter case the ``request_fd`` field contains the + value of the control, ``V4L2_CTRL_WHICH_MIN_VAL`` will return the minimum + value of the control, and ``V4L2_CTRL_WHICH_MAX_VAL`` will return the maximum + value of the control. ``V4L2_CTRL_WHICH_REQUEST_VAL`` indicates that + the control value has to be retrieved from a request or tried/set for + a request. In that case the ``request_fd`` field contains the file descriptor of the request that should be used. If the device does not support requests, then ``EACCES`` will be returned. - When using ``V4L2_CTRL_WHICH_DEF_VAL`` be aware that you can only - get the default value of the control, you cannot set or try it. + When using ``V4L2_CTRL_WHICH_DEF_VAL``, ``V4L2_CTRL_WHICH_MIN_VAL`` + or ``V4L2_CTRL_WHICH_MAX_VAL`` be aware that you can only get the + default/minimum/maximum value of the control, you cannot set or try it. + + Whether a control supports querying the minimum and maximum values using + ``V4L2_CTRL_WHICH_MIN_VAL`` and ``V4L2_CTRL_WHICH_MAX_VAL`` is indicated + by the ``V4L2_CTRL_FLAG_HAS_WHICH_MIN_MAX`` flag. Most non-compound + control types support this. For controls with compound types, the + definition of minimum/maximum values are provided by + the control documentation. If a compound control does not document the + meaning of minimum/maximum value, then querying the minimum or maximum + value will result in the error code -EINVAL. For backwards compatibility you can also use a control class here (see :ref:`ctrl-class`). In that case all controls have to diff --git a/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst b/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst index 4d38acafe8e1..3549417c7feb 100644 --- a/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst +++ b/Documentation/userspace-api/media/v4l/vidioc-queryctrl.rst @@ -441,6 +441,16 @@ See also the examples in :ref:`control`. - n/a - A struct :c:type:`v4l2_area`, containing the width and the height of a rectangular area. Units depend on the use case. + * - ``V4L2_CTRL_TYPE_RECT`` + - n/a + - n/a + - n/a + - A struct :c:type:`v4l2_rect`, containing a rectangle described by + the position of its top-left corner, the width and the height. Units + depend on the use case. Support for ``V4L2_CTRL_WHICH_MIN_VAL`` and + ``V4L2_CTRL_WHICH_MAX_VAL`` is optional and depends on the + ``V4L2_CTRL_FLAG_HAS_WHICH_MIN_MAX`` flag. See the documentation of + the specific control on how to interpret the minimum and maximum values. * - ``V4L2_CTRL_TYPE_H264_SPS`` - n/a - n/a @@ -657,6 +667,10 @@ See also the examples in :ref:`control`. ``dims[0]``. So setting the control with a differently sized array will change the ``elems`` field when the control is queried afterwards. + * - ``V4L2_CTRL_FLAG_HAS_WHICH_MIN_MAX`` + - 0x1000 + - This control supports getting minimum and maximum values using + vidioc_g_ext_ctrls with V4L2_CTRL_WHICH_MIN/MAX_VAL. Return Value ============ diff --git a/Documentation/userspace-api/media/v4l/yuv-formats.rst b/Documentation/userspace-api/media/v4l/yuv-formats.rst index 24b34cdfa6fe..78ee406d7647 100644 --- a/Documentation/userspace-api/media/v4l/yuv-formats.rst +++ b/Documentation/userspace-api/media/v4l/yuv-formats.rst @@ -269,5 +269,6 @@ image. pixfmt-yuv-luma pixfmt-y8i pixfmt-y12i + pixfmt-y16i pixfmt-uv8 pixfmt-m420 diff --git a/Documentation/userspace-api/media/videodev2.h.rst.exceptions b/Documentation/userspace-api/media/videodev2.h.rst.exceptions index d67fd4038d22..35d3456cc812 100644 --- a/Documentation/userspace-api/media/videodev2.h.rst.exceptions +++ b/Documentation/userspace-api/media/videodev2.h.rst.exceptions @@ -150,6 +150,7 @@ replace symbol V4L2_CTRL_TYPE_HEVC_SPS :c:type:`v4l2_ctrl_type` replace symbol V4L2_CTRL_TYPE_HEVC_PPS :c:type:`v4l2_ctrl_type` replace symbol V4L2_CTRL_TYPE_HEVC_SLICE_PARAMS :c:type:`v4l2_ctrl_type` replace symbol V4L2_CTRL_TYPE_AREA :c:type:`v4l2_ctrl_type` +replace symbol V4L2_CTRL_TYPE_RECT :c:type:`v4l2_ctrl_type` replace symbol V4L2_CTRL_TYPE_FWHT_PARAMS :c:type:`v4l2_ctrl_type` replace symbol V4L2_CTRL_TYPE_VP8_FRAME :c:type:`v4l2_ctrl_type` replace symbol V4L2_CTRL_TYPE_VP9_COMPRESSED_HDR :c:type:`v4l2_ctrl_type` @@ -217,6 +218,7 @@ replace define V4L2_FMT_FLAG_CSC_YCBCR_ENC fmtdesc-flags replace define V4L2_FMT_FLAG_CSC_HSV_ENC fmtdesc-flags replace define V4L2_FMT_FLAG_CSC_QUANTIZATION fmtdesc-flags replace define V4L2_FMT_FLAG_META_LINE_BASED fmtdesc-flags +replace define V4L2_FMTDESC_FLAG_ENUM_ALL fmtdesc-flags # V4L2 timecode types replace define V4L2_TC_TYPE_24FPS timecode-type @@ -394,6 +396,7 @@ replace define V4L2_CTRL_FLAG_HAS_PAYLOAD control-flags replace define V4L2_CTRL_FLAG_EXECUTE_ON_WRITE control-flags replace define V4L2_CTRL_FLAG_MODIFY_LAYOUT control-flags replace define V4L2_CTRL_FLAG_DYNAMIC_ARRAY control-flags +replace define V4L2_CTRL_FLAG_HAS_WHICH_MIN_MAX control-flags replace define V4L2_CTRL_FLAG_NEXT_CTRL control replace define V4L2_CTRL_FLAG_NEXT_COMPOUND control @@ -568,6 +571,8 @@ ignore define V4L2_CTRL_DRIVER_PRIV ignore define V4L2_CTRL_MAX_DIMS ignore define V4L2_CTRL_WHICH_CUR_VAL ignore define V4L2_CTRL_WHICH_DEF_VAL +ignore define V4L2_CTRL_WHICH_MIN_VAL +ignore define V4L2_CTRL_WHICH_MAX_VAL ignore define V4L2_CTRL_WHICH_REQUEST_VAL ignore define V4L2_OUT_CAP_CUSTOM_TIMINGS ignore define V4L2_CID_MAX_CTRLS diff --git a/Documentation/userspace-api/mseal.rst b/Documentation/userspace-api/mseal.rst index 41102f74c5e2..ea9b11a0bd89 100644 --- a/Documentation/userspace-api/mseal.rst +++ b/Documentation/userspace-api/mseal.rst @@ -27,7 +27,7 @@ SYSCALL ======= mseal syscall signature ----------------------- - ``int mseal(void \* addr, size_t len, unsigned long flags)`` + ``int mseal(void *addr, size_t len, unsigned long flags)`` **addr**/**len**: virtual memory address range. The address range set by **addr**/**len** must meet: @@ -130,6 +130,27 @@ Use cases - Chrome browser: protect some security sensitive data structures. +- System mappings: + The system mappings are created by the kernel and includes vdso, vvar, + vvar_vclock, vectors (arm compat-mode), sigpage (arm compat-mode), uprobes. + + Those system mappings are readonly only or execute only, memory sealing can + protect them from ever changing to writable or unmmap/remapped as different + attributes. This is useful to mitigate memory corruption issues where a + corrupted pointer is passed to a memory management system. + + If supported by an architecture (CONFIG_ARCH_SUPPORTS_MSEAL_SYSTEM_MAPPINGS), + the CONFIG_MSEAL_SYSTEM_MAPPINGS seals all system mappings of this + architecture. + + The following architectures currently support this feature: x86-64, arm64, + loongarch and s390. + + WARNING: This feature breaks programs which rely on relocating + or unmapping system mappings. Known broken software at the time + of writing includes CHECKPOINT_RESTORE, UML, gVisor, rr. Therefore + this config can't be enabled universally. + When not to use mseal ===================== Applications can apply sealing to any virtual memory region from userspace, diff --git a/Documentation/userspace-api/netlink/c-code-gen.rst b/Documentation/userspace-api/netlink/c-code-gen.rst index 89de42c13350..46415e6d646d 100644 --- a/Documentation/userspace-api/netlink/c-code-gen.rst +++ b/Documentation/userspace-api/netlink/c-code-gen.rst @@ -56,7 +56,9 @@ If ``name-prefix`` is specified it replaces the ``$family-$enum`` portion of the entry name. Boolean ``render-max`` controls creation of the max values -(which are enabled by default for attribute enums). +(which are enabled by default for attribute enums). These max +values are named ``__$pfx-MAX`` and ``$pfx-MAX``. The name +of the first value can be overridden via ``enum-cnt-name`` property. Attributes ========== diff --git a/Documentation/userspace-api/netlink/intro-specs.rst b/Documentation/userspace-api/netlink/intro-specs.rst index bada89699455..a4435ae4628d 100644 --- a/Documentation/userspace-api/netlink/intro-specs.rst +++ b/Documentation/userspace-api/netlink/intro-specs.rst @@ -15,7 +15,7 @@ developing Netlink related code. The tool is implemented in Python and can use a YAML specification to issue Netlink requests to the kernel. Only Generic Netlink is supported. -The tool is located at ``tools/net/ynl/cli.py``. It accepts +The tool is located at ``tools/net/ynl/pyynl/cli.py``. It accepts a handul of arguments, the most important ones are: - ``--spec`` - point to the spec file @@ -27,7 +27,7 @@ YAML specs can be found under ``Documentation/netlink/specs/``. Example use:: - $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/ethtool.yaml \ + $ ./tools/net/ynl/pyynl/cli.py --spec Documentation/netlink/specs/ethtool.yaml \ --do rings-get \ --json '{"header":{"dev-index": 18}}' {'header': {'dev-index': 18, 'dev-name': 'eni1np1'}, @@ -75,7 +75,7 @@ the two marker lines like above to a file, add that file to git, and run the regeneration tool. Grep the tree for ``YNL-GEN`` to see other examples. -The code generation itself is performed by ``tools/net/ynl/ynl-gen-c.py`` +The code generation itself is performed by ``tools/net/ynl/pyynl/ynl_gen_c.py`` but it takes a few arguments so calling it directly for each file quickly becomes tedious. @@ -84,7 +84,7 @@ YNL lib ``tools/net/ynl/lib/`` contains an implementation of a C library (based on libmnl) which integrates with code generated by -``tools/net/ynl/ynl-gen-c.py`` to create easy to use netlink wrappers. +``tools/net/ynl/pyynl/ynl_gen_c.py`` to create easy to use netlink wrappers. YNL basics ---------- diff --git a/Documentation/userspace-api/netlink/netlink-raw.rst b/Documentation/userspace-api/netlink/netlink-raw.rst index 1990eea772d0..31fc91020eb3 100644 --- a/Documentation/userspace-api/netlink/netlink-raw.rst +++ b/Documentation/userspace-api/netlink/netlink-raw.rst @@ -62,7 +62,7 @@ Sub-messages ------------ Several raw netlink families such as -:doc:`rt_link<../../networking/netlink_spec/rt_link>` and +:doc:`rt-link<../../networking/netlink_spec/rt-link>` and :doc:`tc<../../networking/netlink_spec/tc>` use attribute nesting as an abstraction to carry module specific information. diff --git a/Documentation/userspace-api/ntsync.rst b/Documentation/userspace-api/ntsync.rst new file mode 100644 index 000000000000..25e7c4aef968 --- /dev/null +++ b/Documentation/userspace-api/ntsync.rst @@ -0,0 +1,385 @@ +=================================== +NT synchronization primitive driver +=================================== + +This page documents the user-space API for the ntsync driver. + +ntsync is a support driver for emulation of NT synchronization +primitives by user-space NT emulators. It exists because implementation +in user-space, using existing tools, cannot match Windows performance +while offering accurate semantics. It is implemented entirely in +software, and does not drive any hardware device. + +This interface is meant as a compatibility tool only, and should not +be used for general synchronization. Instead use generic, versatile +interfaces such as futex(2) and poll(2). + +Synchronization primitives +========================== + +The ntsync driver exposes three types of synchronization primitives: +semaphores, mutexes, and events. + +A semaphore holds a single volatile 32-bit counter, and a static 32-bit +integer denoting the maximum value. It is considered signaled (that is, +can be acquired without contention, or will wake up a waiting thread) +when the counter is nonzero. The counter is decremented by one when a +wait is satisfied. Both the initial and maximum count are established +when the semaphore is created. + +A mutex holds a volatile 32-bit recursion count, and a volatile 32-bit +identifier denoting its owner. A mutex is considered signaled when its +owner is zero (indicating that it is not owned). The recursion count is +incremented when a wait is satisfied, and ownership is set to the given +identifier. + +A mutex also holds an internal flag denoting whether its previous owner +has died; such a mutex is said to be abandoned. Owner death is not +tracked automatically based on thread death, but rather must be +communicated using ``NTSYNC_IOC_MUTEX_KILL``. An abandoned mutex is +inherently considered unowned. + +Except for the "unowned" semantics of zero, the actual value of the +owner identifier is not interpreted by the ntsync driver at all. The +intended use is to store a thread identifier; however, the ntsync +driver does not actually validate that a calling thread provides +consistent or unique identifiers. + +An event is similar to a semaphore with a maximum count of one. It holds +a volatile boolean state denoting whether it is signaled or not. There +are two types of events, auto-reset and manual-reset. An auto-reset +event is designaled when a wait is satisfied; a manual-reset event is +not. The event type is specified when the event is created. + +Unless specified otherwise, all operations on an object are atomic and +totally ordered with respect to other operations on the same object. + +Objects are represented by files. When all file descriptors to an +object are closed, that object is deleted. + +Char device +=========== + +The ntsync driver creates a single char device /dev/ntsync. Each file +description opened on the device represents a unique instance intended +to back an individual NT virtual machine. Objects created by one ntsync +instance may only be used with other objects created by the same +instance. + +ioctl reference +=============== + +All operations on the device are done through ioctls. There are four +structures used in ioctl calls:: + + struct ntsync_sem_args { + __u32 count; + __u32 max; + }; + + struct ntsync_mutex_args { + __u32 owner; + __u32 count; + }; + + struct ntsync_event_args { + __u32 signaled; + __u32 manual; + }; + + struct ntsync_wait_args { + __u64 timeout; + __u64 objs; + __u32 count; + __u32 owner; + __u32 index; + __u32 alert; + __u32 flags; + __u32 pad; + }; + +Depending on the ioctl, members of the structure may be used as input, +output, or not at all. + +The ioctls on the device file are as follows: + +.. c:macro:: NTSYNC_IOC_CREATE_SEM + + Create a semaphore object. Takes a pointer to struct + :c:type:`ntsync_sem_args`, which is used as follows: + + .. list-table:: + + * - ``count`` + - Initial count of the semaphore. + * - ``max`` + - Maximum count of the semaphore. + + Fails with ``EINVAL`` if ``count`` is greater than ``max``. + On success, returns a file descriptor the created semaphore. + +.. c:macro:: NTSYNC_IOC_CREATE_MUTEX + + Create a mutex object. Takes a pointer to struct + :c:type:`ntsync_mutex_args`, which is used as follows: + + .. list-table:: + + * - ``count`` + - Initial recursion count of the mutex. + * - ``owner`` + - Initial owner of the mutex. + + If ``owner`` is nonzero and ``count`` is zero, or if ``owner`` is + zero and ``count`` is nonzero, the function fails with ``EINVAL``. + On success, returns a file descriptor the created mutex. + +.. c:macro:: NTSYNC_IOC_CREATE_EVENT + + Create an event object. Takes a pointer to struct + :c:type:`ntsync_event_args`, which is used as follows: + + .. list-table:: + + * - ``signaled`` + - If nonzero, the event is initially signaled, otherwise + nonsignaled. + * - ``manual`` + - If nonzero, the event is a manual-reset event, otherwise + auto-reset. + + On success, returns a file descriptor the created event. + +The ioctls on the individual objects are as follows: + +.. c:macro:: NTSYNC_IOC_SEM_POST + + Post to a semaphore object. Takes a pointer to a 32-bit integer, + which on input holds the count to be added to the semaphore, and on + output contains its previous count. + + If adding to the semaphore's current count would raise the latter + past the semaphore's maximum count, the ioctl fails with + ``EOVERFLOW`` and the semaphore is not affected. If raising the + semaphore's count causes it to become signaled, eligible threads + waiting on this semaphore will be woken and the semaphore's count + decremented appropriately. + +.. c:macro:: NTSYNC_IOC_MUTEX_UNLOCK + + Release a mutex object. Takes a pointer to struct + :c:type:`ntsync_mutex_args`, which is used as follows: + + .. list-table:: + + * - ``owner`` + - Specifies the owner trying to release this mutex. + * - ``count`` + - On output, contains the previous recursion count. + + If ``owner`` is zero, the ioctl fails with ``EINVAL``. If ``owner`` + is not the current owner of the mutex, the ioctl fails with + ``EPERM``. + + The mutex's count will be decremented by one. If decrementing the + mutex's count causes it to become zero, the mutex is marked as + unowned and signaled, and eligible threads waiting on it will be + woken as appropriate. + +.. c:macro:: NTSYNC_IOC_SET_EVENT + + Signal an event object. Takes a pointer to a 32-bit integer, which on + output contains the previous state of the event. + + Eligible threads will be woken, and auto-reset events will be + designaled appropriately. + +.. c:macro:: NTSYNC_IOC_RESET_EVENT + + Designal an event object. Takes a pointer to a 32-bit integer, which + on output contains the previous state of the event. + +.. c:macro:: NTSYNC_IOC_PULSE_EVENT + + Wake threads waiting on an event object while leaving it in an + unsignaled state. Takes a pointer to a 32-bit integer, which on + output contains the previous state of the event. + + A pulse operation can be thought of as a set followed by a reset, + performed as a single atomic operation. If two threads are waiting on + an auto-reset event which is pulsed, only one will be woken. If two + threads are waiting a manual-reset event which is pulsed, both will + be woken. However, in both cases, the event will be unsignaled + afterwards, and a simultaneous read operation will always report the + event as unsignaled. + +.. c:macro:: NTSYNC_IOC_READ_SEM + + Read the current state of a semaphore object. Takes a pointer to + struct :c:type:`ntsync_sem_args`, which is used as follows: + + .. list-table:: + + * - ``count`` + - On output, contains the current count of the semaphore. + * - ``max`` + - On output, contains the maximum count of the semaphore. + +.. c:macro:: NTSYNC_IOC_READ_MUTEX + + Read the current state of a mutex object. Takes a pointer to struct + :c:type:`ntsync_mutex_args`, which is used as follows: + + .. list-table:: + + * - ``owner`` + - On output, contains the current owner of the mutex, or zero + if the mutex is not currently owned. + * - ``count`` + - On output, contains the current recursion count of the mutex. + + If the mutex is marked as abandoned, the function fails with + ``EOWNERDEAD``. In this case, ``count`` and ``owner`` are set to + zero. + +.. c:macro:: NTSYNC_IOC_READ_EVENT + + Read the current state of an event object. Takes a pointer to struct + :c:type:`ntsync_event_args`, which is used as follows: + + .. list-table:: + + * - ``signaled`` + - On output, contains the current state of the event. + * - ``manual`` + - On output, contains 1 if the event is a manual-reset event, + and 0 otherwise. + +.. c:macro:: NTSYNC_IOC_KILL_OWNER + + Mark a mutex as unowned and abandoned if it is owned by the given + owner. Takes an input-only pointer to a 32-bit integer denoting the + owner. If the owner is zero, the ioctl fails with ``EINVAL``. If the + owner does not own the mutex, the function fails with ``EPERM``. + + Eligible threads waiting on the mutex will be woken as appropriate + (and such waits will fail with ``EOWNERDEAD``, as described below). + +.. c:macro:: NTSYNC_IOC_WAIT_ANY + + Poll on any of a list of objects, atomically acquiring at most one. + Takes a pointer to struct :c:type:`ntsync_wait_args`, which is + used as follows: + + .. list-table:: + + * - ``timeout`` + - Absolute timeout in nanoseconds. If ``NTSYNC_WAIT_REALTIME`` + is set, the timeout is measured against the REALTIME clock; + otherwise it is measured against the MONOTONIC clock. If the + timeout is equal to or earlier than the current time, the + function returns immediately without sleeping. If ``timeout`` + is U64_MAX, the function will sleep until an object is + signaled, and will not fail with ``ETIMEDOUT``. + * - ``objs`` + - Pointer to an array of ``count`` file descriptors + (specified as an integer so that the structure has the same + size regardless of architecture). If any object is + invalid, the function fails with ``EINVAL``. + * - ``count`` + - Number of objects specified in the ``objs`` array. + If greater than ``NTSYNC_MAX_WAIT_COUNT``, the function fails + with ``EINVAL``. + * - ``owner`` + - Mutex owner identifier. If any object in ``objs`` is a mutex, + the ioctl will attempt to acquire that mutex on behalf of + ``owner``. If ``owner`` is zero, the ioctl fails with + ``EINVAL``. + * - ``index`` + - On success, contains the index (into ``objs``) of the object + which was signaled. If ``alert`` was signaled instead, + this contains ``count``. + * - ``alert`` + - Optional event object file descriptor. If nonzero, this + specifies an "alert" event object which, if signaled, will + terminate the wait. If nonzero, the identifier must point to a + valid event. + * - ``flags`` + - Zero or more flags. Currently the only flag is + ``NTSYNC_WAIT_REALTIME``, which causes the timeout to be + measured against the REALTIME clock instead of MONOTONIC. + * - ``pad`` + - Unused, must be set to zero. + + This function attempts to acquire one of the given objects. If unable + to do so, it sleeps until an object becomes signaled, subsequently + acquiring it, or the timeout expires. In the latter case the ioctl + fails with ``ETIMEDOUT``. The function only acquires one object, even + if multiple objects are signaled. + + A semaphore is considered to be signaled if its count is nonzero, and + is acquired by decrementing its count by one. A mutex is considered + to be signaled if it is unowned or if its owner matches the ``owner`` + argument, and is acquired by incrementing its recursion count by one + and setting its owner to the ``owner`` argument. An auto-reset event + is acquired by designaling it; a manual-reset event is not affected + by acquisition. + + Acquisition is atomic and totally ordered with respect to other + operations on the same object. If two wait operations (with different + ``owner`` identifiers) are queued on the same mutex, only one is + signaled. If two wait operations are queued on the same semaphore, + and a value of one is posted to it, only one is signaled. + + If an abandoned mutex is acquired, the ioctl fails with + ``EOWNERDEAD``. Although this is a failure return, the function may + otherwise be considered successful. The mutex is marked as owned by + the given owner (with a recursion count of 1) and as no longer + abandoned, and ``index`` is still set to the index of the mutex. + + The ``alert`` argument is an "extra" event which can terminate the + wait, independently of all other objects. + + It is valid to pass the same object more than once, including by + passing the same event in the ``objs`` array and in ``alert``. If a + wakeup occurs due to that object being signaled, ``index`` is set to + the lowest index corresponding to that object. + + The function may fail with ``EINTR`` if a signal is received. + +.. c:macro:: NTSYNC_IOC_WAIT_ALL + + Poll on a list of objects, atomically acquiring all of them. Takes a + pointer to struct :c:type:`ntsync_wait_args`, which is used + identically to ``NTSYNC_IOC_WAIT_ANY``, except that ``index`` is + always filled with zero on success if not woken via alert. + + This function attempts to simultaneously acquire all of the given + objects. If unable to do so, it sleeps until all objects become + simultaneously signaled, subsequently acquiring them, or the timeout + expires. In the latter case the ioctl fails with ``ETIMEDOUT`` and no + objects are modified. + + Objects may become signaled and subsequently designaled (through + acquisition by other threads) while this thread is sleeping. Only + once all objects are simultaneously signaled does the ioctl acquire + them and return. The entire acquisition is atomic and totally ordered + with respect to other operations on any of the given objects. + + If an abandoned mutex is acquired, the ioctl fails with + ``EOWNERDEAD``. Similarly to ``NTSYNC_IOC_WAIT_ANY``, all objects are + nevertheless marked as acquired. Note that if multiple mutex objects + are specified, there is no way to know which were marked as + abandoned. + + As with "any" waits, the ``alert`` argument is an "extra" event which + can terminate the wait. Critically, however, an "all" wait will + succeed if all members in ``objs`` are signaled, *or* if ``alert`` is + signaled. In the latter case ``index`` will be set to ``count``. As + with "any" waits, if both conditions are filled, the former takes + priority, and objects in ``objs`` will be acquired. + + Unlike ``NTSYNC_IOC_WAIT_ANY``, it is not valid to pass the same + object more than once, nor is it valid to pass the same object in + ``objs`` and in ``alert``. If this is attempted, the function fails + with ``EINVAL``. diff --git a/Documentation/userspace-api/perf_ring_buffer.rst b/Documentation/userspace-api/perf_ring_buffer.rst index bde9d8cbc106..dc71544532ce 100644 --- a/Documentation/userspace-api/perf_ring_buffer.rst +++ b/Documentation/userspace-api/perf_ring_buffer.rst @@ -627,7 +627,7 @@ regular ring buffer. AUX events and AUX trace data are two different things. Let's see an example:: - perf record -a -e cycles -e cs_etm/@tmc_etr0/ -- sleep 2 + perf record -a -e cycles -e cs_etm// -- sleep 2 The above command enables two events: one is the event *cycles* from PMU and another is the AUX event *cs_etm* from Arm CoreSight, both are saved @@ -766,7 +766,7 @@ only record AUX trace data at a specific time point which users are interested in. E.g. below gives an example of how to take snapshots with 1 second interval with Arm CoreSight:: - perf record -e cs_etm/@tmc_etr0/u -S -a program & + perf record -e cs_etm//u -S -a program & PERFPID=$! while true; do kill -USR2 $PERFPID diff --git a/Documentation/userspace-api/sysfs-platform_profile.rst b/Documentation/userspace-api/sysfs-platform_profile.rst index 4fccde2e4563..6613e188242a 100644 --- a/Documentation/userspace-api/sysfs-platform_profile.rst +++ b/Documentation/userspace-api/sysfs-platform_profile.rst @@ -18,9 +18,9 @@ API for selecting the platform profile of these automatic mechanisms. Note that this API is only for selecting the platform profile, it is NOT a goal of this API to allow monitoring the resulting performance characteristics. Monitoring performance is best done with device/vendor -specific tools such as e.g. turbostat. +specific tools, e.g. turbostat. -Specifically when selecting a high performance profile the actual achieved +Specifically, when selecting a high performance profile the actual achieved performance may be limited by various factors such as: the heat generated by other components, room temperature, free air flow at the bottom of a laptop, etc. It is explicitly NOT a goal of this API to let userspace know @@ -40,3 +40,41 @@ added. Drivers which wish to introduce new profile names must: 1. Explain why the existing profile names cannot be used. 2. Add the new profile name, along with a clear description of the expected behaviour, to the sysfs-platform_profile ABI documentation. + +"Custom" profile support +======================== +The platform_profile class also supports profiles advertising a "custom" +profile. This is intended to be set by drivers when the settings in the +driver have been modified in a way that a standard profile doesn't represent +the current state. + +Multiple driver support +======================= +When multiple drivers on a system advertise a platform profile handler, the +platform profile handler core will only advertise the profiles that are +common between all drivers to the ``/sys/firmware/acpi`` interfaces. + +This is to ensure there is no ambiguity on what the profile names mean when +all handlers don't support a profile. + +Individual drivers will register a 'platform_profile' class device that has +similar semantics as the ``/sys/firmware/acpi/platform_profile`` interface. + +To discover which driver is associated with a platform profile handler the +user can read the ``name`` attribute of the class device. + +To discover available profiles from the class interface the user can read the +``choices`` attribute. + +If a user wants to select a profile for a specific driver, they can do so +by writing to the ``profile`` attribute of the driver's class device. + +This will allow users to set different profiles for different drivers on the +same system. If the selected profile by individual drivers differs the +platform profile handler core will display the profile 'custom' to indicate +that the profiles are not the same. + +While the ``platform_profile`` attribute has the value ``custom``, writing a +common profile from ``platform_profile_choices`` to the platform_profile +attribute of the platform profile handler core will set the profile for all +drivers. |