2023-12-14  ice: use VLAN proto from ring packet context in skb path  (Larysa Zaremba; 2 files, -10/+6)

VLAN proto, used in the ice XDP hints implementation, is stored in the ring packet context. Utilize this value in skb VLAN processing too, instead of checking netdev features. At the same time, use vlan_tci instead of vlan_tag in the touched code, because "VLAN tag" often refers to the VLAN proto and VLAN TCI combined, while in the code we clearly store only the VLAN TCI.

Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/r/20231205210847.28460-12-larysa.zaremba@intel.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-14  ice: Implement VLAN tag hint  (Larysa Zaremba; 6 files, -9/+59)

Implement the .xmo_rx_vlan_tag callback to allow XDP code to read a packet's VLAN tag. At the same time, use vlan_tci instead of vlan_tag in the touched code, because "VLAN tag" often refers to the VLAN proto and VLAN TCI combined, while in the code we clearly store only the VLAN TCI.

Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
Link: https://lore.kernel.org/r/20231205210847.28460-11-larysa.zaremba@intel.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-14  xdp: Add VLAN tag hint  (Larysa Zaremba; 7 files, -1/+57)

Implement functionality that enables drivers to expose the VLAN tag to XDP code. The VLAN tag is represented by two variables:
- protocol ID, which is passed to BPF code in big-endian byte order
- VLAN TCI, in host byte order

Acked-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Link: https://lore.kernel.org/r/20231205210847.28460-10-larysa.zaremba@intel.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
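[Editor's note: from the BPF side this surfaces as a metadata kfunc. A minimal consumption sketch follows; the kfunc name and signature match this series, but the surrounding program (section name, printed fields) is illustrative only.]

    /* SPDX-License-Identifier: GPL-2.0 */
    #include <vmlinux.h>
    #include <bpf/bpf_helpers.h>
    #include <bpf/bpf_endian.h>

    /* kfunc added by this series; __ksym lets libbpf resolve it */
    extern int bpf_xdp_metadata_rx_vlan_tag(const struct xdp_md *ctx,
                                            __be16 *vlan_proto,
                                            u16 *vlan_tci) __ksym;

    SEC("xdp")
    int rx_vlan_hint(struct xdp_md *ctx)
    {
            __be16 proto;   /* big-endian, as described above */
            u16 tci;        /* host byte order */

            if (bpf_xdp_metadata_rx_vlan_tag(ctx, &proto, &tci))
                    return XDP_PASS;        /* hint not available for this frame */

            /* VID is the low 12 bits of the TCI */
            bpf_printk("vlan proto %x vid %u", bpf_ntohs(proto), tci & 0xfff);
            return XDP_PASS;
    }

    char _license[] SEC("license") = "GPL";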
2023-12-14  ice: Support XDP hints in AF_XDP ZC mode  (Larysa Zaremba; 2 files, -0/+19)

In AF_XDP ZC, the xdp_buff is not stored on the ring; instead, it is provided by xsk_buff_pool. Space for metadata sources right after such buffers was already reserved in commit 94ecc5ca4dbf ("xsk: Add cb area to struct xdp_buff_xsk"). Some things (such as the pointer to the packet context) do not change on a per-packet basis, so they can be set at the same time as the RX queue info. On the other hand, the RX descriptor is unique for each packet, but it is already known when setting DMA addresses. This minimizes the performance impact of hints on regular packet processing. Update AF_XDP ZC packet processing to support XDP hints.

Co-developed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/r/20231205210847.28460-9-larysa.zaremba@intel.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-14  xsk: add functions to fill control buffer  (Maciej Fijalkowski; 3 files, -0/+31)

Commit 94ecc5ca4dbf ("xsk: Add cb area to struct xdp_buff_xsk") has added a buffer for custom data to xdp_buff_xsk. In particular, this memory is used for data consumed by XDP hints kfuncs. It does not always change on a per-packet basis, and some parts can be set, for example, at the same time as the RX queue info. Add functions to fill all cbs in xsk_buff_pool with the same metadata.

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/r/20231205210847.28460-8-larysa.zaremba@intel.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
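[Editor's note: a driver-side sketch of the new API under the assumption that the descriptor struct carries a source pointer, an offset into the cb area, and a byte count as this patch describes; the ice struct names are borrowed from the rest of the series for illustration.]

    #include <net/xsk_buff_pool.h>

    static void example_fill_pool_cbs(struct xsk_buff_pool *pool,
                                      struct ice_rx_ring *ring)
    {
            void *ctx_ptr = &ring->pkt_ctx;
            struct xsk_cb_desc desc = {
                    .src = &ctx_ptr,        /* metadata copied into every cb */
                    .off = offsetof(struct ice_xdp_buff, pkt_ctx) -
                           sizeof(struct xdp_buff),
                    .bytes = sizeof(ctx_ptr),
            };

            /* stamp the same metadata into the cb of each buffer in the pool */
            xsk_pool_fill_cb(pool, &desc);
    }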
2023-12-14  ice: Support RX hash XDP hint  (Larysa Zaremba; 3 files, -204/+284)

The RX hash XDP hint requests both the hash value and type. The type is XDP-specific, so we need a separate way to map these values to the hardware ptypes; create a lookup table for this. Instead of creating a new long list, reuse the contents of ice_decode_rx_desc_ptype[] through the preprocessor. The current hash type enum does not contain an ICMP packet type, but ice devices support it, so also add a new type into core code. Then use the previously refactored code and create a function that allows XDP code to read the RX hash.

Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
Link: https://lore.kernel.org/r/20231205210847.28460-7-larysa.zaremba@intel.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
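[Editor's note: on the consumer side this pairs with the existing RX hash metadata kfunc, which returns the hash value plus the XDP-specific hash type that the lookup table above feeds. A short sketch, reusing the scaffolding from the VLAN example earlier:]

    extern int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, __u32 *hash,
                                        enum xdp_rss_hash_type *rss_type) __ksym;

    SEC("xdp")
    int rx_hash_hint(struct xdp_md *ctx)
    {
            enum xdp_rss_hash_type type;
            __u32 hash;

            if (!bpf_xdp_metadata_rx_hash(ctx, &hash, &type))
                    bpf_printk("hash %x type %d", hash, type);
            return XDP_PASS;
    }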
2023-12-14  ice: Support HW timestamp hint  (Larysa Zaremba; 7 files, -7/+42)

Use the previously refactored code and create a function that allows XDP code to read the HW timestamp. Also, introduce the packet context, where hints-related data will be stored. ice_xdp_buff contains only a pointer to this structure, to avoid copying it in ZC mode later in the series. The HW timestamp is the first supported hint in the driver, so also add xdp_metadata_ops.

Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
Link: https://lore.kernel.org/r/20231205210847.28460-6-larysa.zaremba@intel.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
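[Editor's note: a sketch of the driver-side wiring. struct xdp_metadata_ops and the .xmo_rx_timestamp signature are the core-kernel API; the ice handler and helper names are assumptions modeled on this series.]

    static int ice_xdp_rx_hw_ts(const struct xdp_md *ctx, u64 *ts_ns)
    {
            const struct ice_xdp_buff *xdp_ext = (const void *)ctx;

            /* read the timestamp via the descriptor stored next to xdp_buff */
            *ts_ns = ice_ptp_get_rx_hwts(xdp_ext->eop_desc, xdp_ext->pkt_ctx);
            return *ts_ns ? 0 : -ENODATA;
    }

    static const struct xdp_metadata_ops ice_xdp_md_ops = {
            .xmo_rx_timestamp = ice_xdp_rx_hw_ts,
    };

    /* at netdev registration time */
    netdev->xdp_metadata_ops = &ice_xdp_md_ops;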
2023-12-14  ice: Introduce ice_xdp_buff  (Larysa Zaremba; 3 files, -5/+30)

In order to use XDP hints via kfuncs, we need to put the RX descriptor and miscellaneous data next to the xdp_buff. As in the hints implementations in other drivers, we achieve this by putting the xdp_buff into a child structure. Currently, the xdp_buff is stored in the ring structure, so replace it with a union that includes the child structure. This way enough memory is available while existing XDP code remains isolated from hints. The minimum size of the new child structure (ice_xdp_buff) is exactly 64 bytes (a single cache line). To place it at the start of a cache line, move the 'next' field from CL1 to CL4, as it isn't used often. This still leaves 192 bits available in CL3 for packet context extensions.

Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/r/20231205210847.28460-5-larysa.zaremba@intel.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
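[Editor's note: the layout this describes, sketched; the exact field set of the child structure is an assumption for illustration.]

    struct ice_xdp_buff {
            struct xdp_buff xdp_buff;       /* must stay first (cast target) */
            const union ice_32b_rx_flex_desc *eop_desc; /* hints data source */
    };

    /* inside struct ice_rx_ring: same storage, two views */
    union {
            struct ice_xdp_buff xdp_ext;    /* hints code sees this */
            struct xdp_buff xdp;            /* existing XDP code is unchanged */
    };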
2023-12-14  ice: Make ptype internal to descriptor info processing  (Larysa Zaremba; 4 files, -13/+16)

Currently, the rx_ptype variable is used only as an argument to ice_process_skb_fields() and is computed just before the function call. Therefore, there is no reason to pass this value as an argument. Instead, remove this argument and compute the value directly inside the ice_process_skb_fields() function. Also, separate its calculation into a short function, so the code can later be reused in the .xmo_() callbacks.

Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
Link: https://lore.kernel.org/r/20231205210847.28460-4-larysa.zaremba@intel.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-14  ice: make RX HW timestamp reading code more reusable  (Larysa Zaremba; 3 files, -20/+36)

Previously, we only needed the RX HW timestamp in the skb path, hence all related code was written with skb in mind. But with the addition of XDP hints via kfuncs to the ice driver, the same logic will be needed in the .xmo_() callbacks. Put the generic process of reading the RX HW timestamp from a descriptor into a separate function. Move the skb-related code into another source file.

Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
Link: https://lore.kernel.org/r/20231205210847.28460-3-larysa.zaremba@intel.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-14  ice: make RX hash reading code more reusable  (Larysa Zaremba; 1 file, -11/+25)

Previously, we only needed the RX hash in the skb path, hence all related code was written with skb in mind. But with the addition of XDP hints via kfuncs to the ice driver, the same logic will be needed in the .xmo_() callbacks. Separate the generic process of reading the RX hash from a descriptor into its own function.

Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
Link: https://lore.kernel.org/r/20231205210847.28460-2-larysa.zaremba@intel.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-14  Merge branch 'bpf-token-support-in-libbpf-s-bpf-object'  (Alexei Starovoitov; 14 files, -473/+1065)

Andrii Nakryiko says:
====================
BPF token support in libbpf's BPF object

Add fuller support for BPF token in high-level BPF object APIs. This is the most frequently used way to work with BPF using libbpf, so supporting BPF token there is critical.

Patch #1 improves kernel-side BPF_TOKEN_CREATE behavior by rejecting creation of "empty" BPF tokens with no delegation. This seems like saner behavior, which also makes libbpf's caching better overall. If we ever want to create a BPF token with no delegate_xxx options set on BPF FS, we can use a new flag to enable that.

Patches #2-#5 refactor libbpf internals, mostly feature detection code, to prepare it for BPF token FD.

Patch #6 adds options to pass a BPF token into BPF object open options. It also adds implicit BPF token creation logic to the BPF object load step, even without any explicit involvement of the user. If the environment is set up properly, a BPF token will be created transparently and used implicitly. This allows all existing applications to gain BPF token support by just linking with the latest version of the libbpf library. No source code modifications are required. All that under the assumption that a privileged container management agent properly set up the default BPF FS instance at /sys/fs/bpf to allow BPF token creation.

Patches #7-#8 add more selftests, validating that BPF object APIs work as expected under unprivileged user namespaced conditions in the presence of a BPF token.

Patch #9 extends libbpf with LIBBPF_BPF_TOKEN_PATH envvar knowledge, which can be used to override the custom BPF FS location used for the implicit BPF token creation logic without needing to adjust application code. This allows admins or container managers to mount a BPF token-enabled BPF FS at a non-standard location without the need to coordinate with applications. LIBBPF_BPF_TOKEN_PATH can also be used to disable implicit BPF token creation by setting it to an empty value.

Patch #10 tests this new envvar functionality.

v2->v3:
- move some stray feature cache refactorings into patch #4 (Alexei);
- add LIBBPF_BPF_TOKEN_PATH envvar support (Alexei);

v1->v2:
- remove minor code redundancies (Eduard, John);
- add acks and rebase.
====================

Link: https://lore.kernel.org/r/20231213190842.3844987-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-14  selftests/bpf: add tests for LIBBPF_BPF_TOKEN_PATH envvar  (Andrii Nakryiko; 1 file, -0/+112)

Add a new subtest validating LIBBPF_BPF_TOKEN_PATH envvar semantics. Extend the existing test to validate that LIBBPF_BPF_TOKEN_PATH allows disabling implicit BPF token creation by setting the envvar to an empty string.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231213190842.3844987-11-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-14  libbpf: support BPF token path setting through LIBBPF_BPF_TOKEN_PATH envvar  (Andrii Nakryiko; 2 files, -6/+21)

To allow an external admin authority to override the default BPF FS location (/sys/fs/bpf) for implicit BPF token creation, teach libbpf to recognize the LIBBPF_BPF_TOKEN_PATH envvar. If it is specified and the user application didn't explicitly specify either the bpf_token_path or the bpf_token_fd option, it will be treated exactly like the bpf_token_path option, overriding the default /sys/fs/bpf location and making the BPF token mandatory.

Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231213190842.3844987-10-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
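[Editor's note: the override needs no application code change; a small sketch of what setting the envvar amounts to, with /run/bpffs as a hypothetical token-enabled mount point.]

    #include <stdlib.h>

    static void pick_token_fs(int disable)
    {
            if (disable)
                    /* empty value disables implicit BPF token creation */
                    setenv("LIBBPF_BPF_TOKEN_PATH", "", 1);
            else
                    /* treated like bpf_token_path: token becomes mandatory */
                    setenv("LIBBPF_BPF_TOKEN_PATH", "/run/bpffs", 1);
            /* any later bpf_object__load() in this process picks this up */
    }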
2023-12-14  selftests/bpf: add tests for BPF object load with implicit token  (Andrii Nakryiko; 1 file, -0/+76)

Add a test to validate libbpf's implicit BPF token creation from the default BPF FS location (/sys/fs/bpf). Also validate that disabling this implicit BPF token creation works.

Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231213190842.3844987-9-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-14  selftests/bpf: add BPF object loading tests with explicit token passing  (Andrii Nakryiko; 3 files, -0/+185)

Add a few tests that attempt to load a BPF object containing a privileged map, a privileged program, and one requiring mandatory BTF uploading into the kernel (to validate token FD propagation to the BPF_BTF_LOAD command).

Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231213190842.3844987-8-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-14  libbpf: wire up BPF token support at BPF object level  (Andrii Nakryiko; 4 files, -12/+158)

Add BPF token support to BPF object-level functionality.

A BPF token is supported by the BPF object logic either as an explicitly provided BPF token from outside (through a BPF FS path or an explicit BPF token FD), or implicitly (unless prevented through bpf_object_open_opts). Implicit mode is assumed to be the most common one for user namespaced unprivileged workloads. The assumption is that a privileged container manager sets up a default BPF FS mount point at /sys/fs/bpf with BPF token delegation options (delegate_{cmds,maps,progs,attachs} mount options). During loading, the BPF object will attempt to create a BPF token from the /sys/fs/bpf location and pass it to all relevant operations (currently map creation, BTF load, and program load). In this implicit mode, if BPF token creation fails for whatever reason (BPF FS is not mounted, the kernel doesn't support BPF token, etc.), this is not considered an error; the BPF object loading sequence will proceed with no BPF token.

In explicit BPF token mode, the user either provides a custom BPF FS mount point path or creates a BPF token on their own and just passes the token FD directly. In that case, the BPF object will either dup() the token FD (so the caller isn't required to hold onto it for the entire duration of the BPF object's lifetime) or will attempt to create a BPF token from the provided BPF FS location. If BPF token creation fails, that is considered a critical error and BPF object load fails with an error.

Libbpf provides a way to disable implicit BPF token creation if it causes any trouble (BPF token is designed to be completely optional and shouldn't cause any problems even if provided, but in the world of BPF LSM, custom security logic can be installed that might change the outcome depending on the presence of a BPF token). To disable libbpf's default BPF token creation behavior, the user should provide either an invalid BPF token FD (negative) or an empty bpf_token_path option.

BPF token presence can influence libbpf's feature probing, so if a BPF object has an associated BPF token, feature probing is instructed to use a BPF object-specific feature detection cache and token FD.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231213190842.3844987-7-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
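[Editor's note: explicit mode, sketched. The bpf_token_path option name follows this patch; whether a given libbpf version also exposes bpf_token_fd depends on the revision, so treat the exact option set as an assumption.]

    #include <errno.h>
    #include <bpf/libbpf.h>

    int load_with_token(void)
    {
            LIBBPF_OPTS(bpf_object_open_opts, opts,
                    .bpf_token_path = "/run/bpffs");  /* hypothetical mount */
            struct bpf_object *obj;

            obj = bpf_object__open_file("prog.bpf.o", &opts);
            if (!obj)
                    return -errno;
            /* in explicit mode, token creation failure fails the load */
            return bpf_object__load(obj);
    }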
2023-12-14  libbpf: wire up token_fd into feature probing logic  (Andrii Nakryiko; 5 files, -46/+66)

Adjust feature probing callbacks to take into account an optional token_fd. In unprivileged contexts, some feature detectors would fail to detect kernel support just because a BPF program, BPF map, or BTF object can't be loaded due to the privileged nature of those operations. So when a BPF object is loaded with a BPF token, this token should be used for feature probing. This patch sets up support for this scenario, but we don't yet pass a non-zero token FD. That will be added in the next patch. We also switched the BPF cookie detector from using a kprobe program to a tracepoint one, as tracepoint is a somewhat less dangerous BPF program type and has a higher likelihood of being allowed through a BPF token in the future. This change has no effect on detection behavior.

Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231213190842.3844987-6-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-14  libbpf: move feature detection code into its own file  (Andrii Nakryiko; 6 files, -466/+479)

It's quite a lot of well isolated code, so it seems like a good candidate to move out of libbpf.c to reduce its size.

Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231213190842.3844987-5-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-14  libbpf: further decouple feature checking logic from bpf_object  (Andrii Nakryiko; 3 files, -11/+22)

Add a feat_supported() helper that accepts a feature cache instead of a bpf_object. This allows low-level code in bpf.c to not know or care about the higher-level concept of bpf_object, yet it will be able to utilize custom feature checking in cases where a BPF token might influence the outcome.

Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231213190842.3844987-4-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
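[Editor's note: the shape of the helper, sketched; the cache type is internal to libbpf, so treat the names as assumptions.]

    /* NULL cache means "use the default, process-wide feature cache" */
    bool feat_supported(struct kern_feature_cache *cache,
                        enum kern_feature_id feat_id);

    /* low-level bpf.c code can then probe without a bpf_object in hand: */
    if (feat_supported(NULL, FEAT_BPF_COOKIE))
            /* use BPF cookies */;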
2023-12-14  libbpf: split feature detectors definitions from cached results  (Andrii Nakryiko; 1 file, -6/+12)

Split the list of supported feature detectors, with their corresponding callbacks, from the actual cached supported/missing values. This will allow for more flexible per-token or per-object feature detectors in subsequent refactorings.

Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231213190842.3844987-3-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-14  bpf: fail BPF_TOKEN_CREATE if no delegation option was set on BPF FS  (Andrii Nakryiko; 1 file, -1/+9)

It's quite confusing in practice when it's possible to successfully create a BPF token from a BPF FS that didn't have any of the delegate_xxx mount options set up. While it's not wrong, it's actually more meaningful to reject BPF_TOKEN_CREATE with a specific error code (-ENOENT) to let user space know that no token delegation is set up. So, instead of creating an empty BPF token that will always be ignored because it doesn't have any of the allow_xxx bits set, reject it with -ENOENT. If we ever need an empty BPF token to be possible, we can support that with an extra flag passed into BPF_TOKEN_CREATE.

Acked-by: Christian Brauner <brauner@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20231213190842.3844987-2-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
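[Editor's note: what this looks like from user space — a minimal raw-syscall sketch; the token_create attr fields follow the BPF token UAPI as described in this series, so verify them against your uapi headers.]

    #include <errno.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/syscall.h>
    #include <linux/bpf.h>

    static int token_create(int bpffs_fd)
    {
            union bpf_attr attr;

            memset(&attr, 0, sizeof(attr));
            attr.token_create.bpffs_fd = bpffs_fd;

            /* after this patch: fails with errno == ENOENT when the FS was
             * mounted without any delegate_xxx options */
            return syscall(__NR_bpf, BPF_TOKEN_CREATE, &attr, sizeof(attr));
    }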
2023-12-14  bpf: selftests: Add verifier tests for CO-RE bitfield writes  (Daniel Xu; 2 files, -0/+102)

Add some tests that exercise the BPF_CORE_WRITE_BITFIELD() macro. Since some non-trivial bit fiddling is going on, make sure various edge cases (such as adjacent bitfields and bitfields at the edge of structs) are exercised.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Link: https://lore.kernel.org/r/72698a1080fa565f541d5654705255984ea2a029.1702325874.git.dxu@dxuuu.xyz
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-12-14  bpf: selftests: test_loader: Support __btf_path() annotation  (Daniel Xu; 2 files, -0/+8)

This commit adds support for a per-prog btf_custom_path. This is necessary for testing CO-RE relocations on non-vmlinux types using the test_loader infrastructure.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Link: https://lore.kernel.org/r/660ea7f2fdbdd5103bc1af87c9fc931f05327926.1702325874.git.dxu@dxuuu.xyz
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-12-14  libbpf: Add BPF_CORE_WRITE_BITFIELD() macro  (Daniel Xu; 1 file, -0/+32)

=== Motivation ===

Similar to reading from CO-RE bitfields, we need a CO-RE aware bitfield writing wrapper to make the verifier happy.

Two alternatives to this approach are:

1. Use the upcoming `preserve_static_offset` [0] attribute to disable CO-RE on specific structs.
2. Use broader byte-sized writes to write to bitfields.

(1) is a bit hard to use. It requires specific and not-very-obvious annotations to bpftool-generated vmlinux.h. It's also not generally available in released LLVM versions yet.

(2) makes the code quite hard to read and write. And especially if BPF_CORE_READ_BITFIELD() is already being used, it makes more sense to have an inverse helper for writing.

=== Implementation details ===

Since the logic is a bit non-obvious, I thought it would be helpful to explain exactly what's going on.

To start, it helps by explaining what LSHIFT_U64 (lshift) and RSHIFT_U64 (rshift) are designed to mean. Consider the core of the BPF_CORE_READ_BITFIELD() algorithm:

    val <<= __CORE_RELO(s, field, LSHIFT_U64);
    val = val >> __CORE_RELO(s, field, RSHIFT_U64);

Basically what happens is we lshift to clear the non-relevant (blank) higher order bits. Then we rshift to bring the relevant bits (bitfield) down to LSB position (while also clearing blank lower order bits). To illustrate:

    Start:  ........XXX......
    Lshift: XXX......00000000
    Rshift: 00000000000000XXX

where `.` means blank bit, `0` means 0 bit, and `X` means bitfield bit. After the two operations, the bitfield is ready to be interpreted as a regular integer.

Next, we want to build an alternative (but more helpful) mental model of lshift and rshift. That is, to consider:

* rshift as the total number of blank bits in the u64
* lshift as the number of blank bits left of the bitfield in the u64

Take a moment to consider why that is true by consulting the above diagram.

With this insight, we can now define the following relationship:

             bitfield
                _
               | |
      0.....00XXX0...00
      |      |   |    |
      |______|   |    |
       lshift    |    |
                 |____|
            (rshift - lshift)

That is, we know the number of higher order blank bits is just lshift. And the number of lower order blank bits is (rshift - lshift).

Finally, we can examine the core of the write side algorithm (where rpad = rshift - lshift):

    mask = (~0ULL << rshift) >> lshift;              // 1
    val = (val & ~mask) | ((nval << rpad) & mask);   // 2

1. Compute a mask where the set bits are the bitfield bits. The first left shift zeros out exactly the number of blank bits, leaving a bitfield sized set of 1s. The subsequent right shift inserts the correct amount of higher order blank bits.
2. On the left of the `|`, mask out the bitfield bits. This creates 0s where the new bitfield bits will go. On the right of the `|`, bring nval into the correct bit position and mask out any bits that fall outside of the bitfield. Finally, by bor'ing the two halves, we get the final set of bits to write back.

[0]: https://reviews.llvm.org/D133361

Co-developed-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Co-developed-by: Jonathan Lemon <jlemon@aviatrix.com>
Signed-off-by: Jonathan Lemon <jlemon@aviatrix.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Link: https://lore.kernel.org/r/4d3dd215a4fd57d980733886f9c11a45e1a9adf3.1702325874.git.dxu@dxuuu.xyz
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
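[Editor's note: usage mirrors the read-side macro. A minimal sketch; the struct and values are hypothetical, the two macro names are the real bpf_core_read.h helpers.]

    #include <bpf/bpf_helpers.h>
    #include <bpf/bpf_core_read.h>

    /* hypothetical CO-RE-relocatable struct with a bitfield */
    struct pkt_flags {
            unsigned int tos : 6;
            unsigned int ecn : 2;
    };

    static __always_inline void set_ecn(struct pkt_flags *f, unsigned int val)
    {
            unsigned int old = BPF_CORE_READ_BITFIELD(f, ecn);  /* read side */

            if (old != val)
                    BPF_CORE_WRITE_BITFIELD(f, ecn, val);   /* inverse helper */
    }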
2023-12-14  bpf: Support uid and gid when mounting bpffs  (Jie Jiang; 2 files, -1/+51)

Parse uid and gid in bpf_parse_param() so that they can be passed in as the `data` parameter when mounting bpffs. This is useful when we want to control which user/group has control over the mounted bpffs; otherwise a separate chown() call is needed.

Signed-off-by: Jie Jiang <jiejiang@chromium.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Mike Frysinger <vapier@chromium.org>
Acked-by: Christian Brauner <brauner@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20231212093923.497838-1-jiejiang@chromium.org
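[Editor's note: what this enables, sketched as the mount(2) call a container manager might make; the uid/gid values are illustrative.]

    #include <stdio.h>
    #include <sys/mount.h>

    static int mount_bpffs_owned(void)
    {
            /* ownership options are parsed by bpf_parse_param() now,
             * so no follow-up chown() is needed */
            if (mount("bpffs", "/sys/fs/bpf", "bpf", 0, "uid=1000,gid=1000")) {
                    perror("mount");
                    return -1;
            }
            return 0;
    }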
2023-12-14  Merge tag 'v6.7-rockchip-clkfixes1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into clk-fixes  (Stephen Boyd; 2 files, -15/+10)

Pull Rockchip clk driver fixes for the merge window from Heiko Stuebner:
Fixes for a wrong clock name, a wrong clock parent, a wrong clock gate, and finally one new PLL rate for the rk3568 to fix display artifacts on a handheld device based on that SoC.

* tag 'v6.7-rockchip-clkfixes1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip:
  clk: rockchip: rk3128: Fix SCLK_SDMMC's clock name
  clk: rockchip: rk3128: Fix aclk_peri_src's parent
  clk: rockchip: rk3128: Fix HCLK_OTG gate register
  clk: rockchip: rk3568: Add PLL rate for 292.5MHz
2023-12-14  drm/amdgpu: warn when there are still mappings when a BO is destroyed v2  (Christian König; 1 file, -0/+2)

This can only happen when there is a reference counting bug.

v2: fix typo

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-14  drm/amdgpu: fix tear down order in amdgpu_vm_pt_free  (Christian König; 1 file, -1/+2)

When freeing PD/PT with shadows, it can happen that the shadow destruction races with detaching the PD/PT from the VM, causing a NULL pointer dereference in the invalidation code. Fix this by detaching the PD/PT from the VM first and then freeing the shadow instead.

Signed-off-by: Christian König <christian.koenig@amd.com>
Fixes: https://gitlab.freedesktop.org/drm/amd/-/issues/2867
Cc: <stable@vger.kernel.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-14  drm/amd: Fix a probing order problem on SDMA 2.4  (Mario Limonciello; 1 file, -2/+2)

commit 751e293f2c99 ("drm/amd: Move microcode init from sw_init to early_init for SDMA v2.4") made a fateful mistake in that `adev->sdma.num_instances` wasn't declared when sdma_v2_4_init_microcode() was run. This caused probing to fail. Move the declaration to right before sdma_v2_4_init_microcode().

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3043
Fixes: 751e293f2c99 ("drm/amd: Move microcode init from sw_init to early_init for SDMA v2.4")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-14  drm/amdgpu/sdma5.2: add begin/end_use ring callbacks  (Alex Deucher; 1 file, -0/+28)

Add begin/end_use ring callbacks to disallow GFXOFF when SDMA work is submitted and allow it again afterward. This should avoid corner cases where GFXOFF is erroneously entered when SDMA is still active. For now just allow/disallow GFXOFF in the begin and end helpers until we root cause the issue. This should not impact power, as SDMA usage is pretty minimal and GFX should not be active when SDMA is active anyway; this just makes it explicit.

v2: move everything into sdma5.2 code. No reason for this to be generic at this point.
v3: Add comments in new code

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2220
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> (v1)
Tested-by: Mario Limonciello <mario.limonciello@amd.com> (v1)
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 5.15+
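[Editor's note: the mechanism, sketched. amdgpu_gfx_off_ctrl() is the existing GFXOFF allow/disallow refcount helper; the callback names are assumptions modeled on the description.]

    static void sdma_v5_2_ring_begin_use(struct amdgpu_ring *ring)
    {
            /* take a GFXOFF disallow reference while SDMA work is queued */
            amdgpu_gfx_off_ctrl(ring->adev, false);
    }

    static void sdma_v5_2_ring_end_use(struct amdgpu_ring *ring)
    {
            /* drop the reference; GFXOFF may be entered again */
            amdgpu_gfx_off_ctrl(ring->adev, true);
    }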
2023-12-13  sign-file: Fix incorrect return values check  (Yusong Gao; 1 file, -6/+6)

There are some wrong return value checks in sign-file when calling OpenSSL APIs. The ERR() check condition is wrong because the program only checks whether the return value is < 0, which ignores a return value of 0. For example:
1. CMS_final() returns 1 for success or 0 for failure.
2. i2d_CMS_bio_stream() returns 1 for success or 0 for failure.
3. i2d_TYPEbio() returns 1 for success and 0 for failure.
4. BIO_free() returns 1 for success and 0 for failure.

Link: https://www.openssl.org/docs/manmaster/man3/
Fixes: e5a2e3c84782 ("scripts/sign-file.c: Add support for signing with a raw signature")
Signed-off-by: Yusong Gao <a869920004@gmail.com>
Reviewed-by: Juerg Haefliger <juerg.haefliger@canonical.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20231213024405.624692-1-a869920004@gmail.com/ # v5
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
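[Editor's note: the shape of the fix, sketched against sign-file's ERR(cond, fmt, ...) helper, which bails out when cond is true; the CMS_final() call is representative of the pattern.]

    /* before: only caught negative returns, missing the 0-failure case */
    ERR(CMS_final(cms, bm, NULL, CMS_NOCERTS | CMS_BINARY) < 0, "CMS_final");

    /* after: treat anything but 1 as failure, per the OpenSSL man pages */
    ERR(CMS_final(cms, bm, NULL, CMS_NOCERTS | CMS_BINARY) != 1, "CMS_final");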
2023-12-13  Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs  (Linus Torvalds; 1 file, -1/+1)

Pull ufs fix from Al Viro:
"ufs got broken this merge window on folio conversion - calling conventions for filemap_lock_folio() are not the same as for find_lock_page()"

* tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fix ufs_get_locked_folio() breakage
2023-12-13Revert "tcp: disable tcp_autocorking for socket when TCP_NODELAY flag is set"Jakub Kicinski1-1/+1
This reverts commit f3f32a356c0d2379d4431364e74f101f8f075ce3. Paolo reports that the change disables autocorking even after the userspace sets TCP_CORK. Fixes: f3f32a356c0d ("tcp: disable tcp_autocorking for socket when TCP_NODELAY flag is set") Link: https://lore.kernel.org/r/0d30d5a41d3ac990573016308aaeacb40a9dc79f.camel@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-13  Merge tag 'efi-urgent-for-v6.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi  (Linus Torvalds; 4 files, -13/+30)

Pull EFI fixes from Ard Biesheuvel:
- Deal with a regression in the recently refactored x86 EFI stub code on older Dell systems by disabling randomization of the physical load address
- Use the correct load address for relocatable Loongarch kernels

* tag 'efi-urgent-for-v6.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
  efi/x86: Avoid physical KASLR on older Dell systems
  efi/loongarch: Use load address to calculate kernel entry address
2023-12-13  i40e: Fix ST code value for Clause 45  (Ivan Vecera; 2 files, -3/+3)

The ST code value for Clause 45, which was changed by commit 8196b5fd6c73 ("i40e: Refactor I40E_MDIO_CLAUSE* macros"), is currently wrong. The mentioned commit refactored the ..MDIO_CLAUSE??_STCODE_MASK macros so that their value is the same for both clauses. The value is correct for Clause 22 but not for Clause 45. Fix the issue by adding a parameter to the I40E_GLGEN_MSCA_STCODE_MASK macro that specifies the required value.

Fixes: 8196b5fd6c73 ("i40e: Refactor I40E_MDIO_CLAUSE* macros")
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
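[Editor's note: a sketch of the direction of the fix. In the MDIO frame format the ST code is 01 for Clause 22 and 00 for Clause 45; the macro bodies below are illustrative rather than copied from the driver.]

    /* before: one fixed ST value, wrong for Clause 45 */
    #define I40E_GLGEN_MSCA_STCODE_MASK    I40E_MASK(0x1, I40E_GLGEN_MSCA_STCODE_SHIFT)

    /* after: the caller supplies the ST code per clause */
    #define I40E_GLGEN_MSCA_STCODE_MASK(x) I40E_MASK((x), I40E_GLGEN_MSCA_STCODE_SHIFT)

    cmd |= I40E_GLGEN_MSCA_STCODE_MASK(1);  /* Clause 22: ST = 01 */
    cmd |= I40E_GLGEN_MSCA_STCODE_MASK(0);  /* Clause 45: ST = 00 */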
2023-12-13  ice: fix theoretical out-of-bounds access in ethtool link modes  (Michal Schmidt; 1 file, -2/+2)

To map phy types reported by the hardware to ethtool link mode bits, ice uses two lookup tables (phy_type_low_lkup, phy_type_high_lkup). The "low" table has 64 elements to cover every possible bit the hardware may report, but the "high" table has only 13. If the hardware reports a higher bit in phy_types_high, the driver would access memory beyond the lookup table's end. Instead of iterating through all 64 bits of phy_types_{low,high}, use the sizes of the respective lookup tables.

Fixes: 9136e1f1e5c3 ("ice: refactor PHY type to ethtool link mode")
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
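[Editor's note: the pattern of the fix, sketched; the loop bodies are placeholders for the real ethtool link-mode mapping code.]

    /* before: assumed 64 valid entries for both tables */
    for (i = 0; i < 64; i++)
            if (phy_types_high & BIT_ULL(i))
                    /* phy_type_high_lkup[i] reads OOB past entry 12 */;

    /* after: bound each loop by its own table's size */
    for (i = 0; i < ARRAY_SIZE(phy_type_high_lkup); i++)
            if (phy_types_high & BIT_ULL(i))
                    /* safe: i stays within the 13-entry table */;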
2023-12-13  selftests/bpf: fix compiler warnings in RELEASE=1 mode  (Andrii Nakryiko; 2 files, -2/+2)

When compiling BPF selftests with RELEASE=1, we get two new warnings, which are treated as errors. Fix them.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/r/20231212225343.1723081-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-13  bcachefs: Fix determining required file handle length  (Jan Kara; 1 file, -5/+14)

The ->encode_fh method is responsible for setting the amount of space required for storing the file handle if not enough space was provided. bch2_encode_fh() was not setting the required length in that case, which breaks e.g. fanotify. Fix it.

Reported-by: Petr Vorel <pvorel@suse.cz>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
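[Editor's note: the exportfs contract this fix restores, sketched; the handle layout and fileid type value are hypothetical, the *max_len / FILEID_INVALID convention is the rule.]

    static int example_encode_fh(struct inode *inode, u32 *fh, int *max_len,
                                 struct inode *parent)
    {
            int len = 3;    /* handle size in 32-bit words; hypothetical */

            if (*max_len < len) {
                    *max_len = len;         /* tell the caller what it needs */
                    return FILEID_INVALID;  /* 0xff: buffer too small */
            }
            /* ... fill fh[0..len-1] ... */
            *max_len = len;
            return 0x81;    /* fs-specific fileid type; hypothetical */
    }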
2023-12-13  drm/panel: ltk050h3146w: Set burst mode for ltk050h3148w  (Farouk Bouabid; 1 file, -1/+1)

The ltk050h3148w variant expects the horizontal component lane byte clock cycle (lbcc) to be calculated using lane_mbps (burst mode) instead of the pixel clock. Using the pixel clock rate by default for this calculation was introduced in commit ac87d23694f4 ("drm/bridge: synopsys: dw-mipi-dsi: Use pixel clock rate to calculate lbcc"), and starting from commit 93e82bb4de01 ("drm/bridge: synopsys: dw-mipi-dsi: Fix hcomponent lbcc for burst mode") only panels that support burst mode can keep using the lane_mbps. So add MIPI_DSI_MODE_VIDEO_BURST as part of the mode_flags for the dsi host.

Fixes: 93e82bb4de01 ("drm/bridge: synopsys: dw-mipi-dsi: Fix hcomponent lbcc for burst mode")
Signed-off-by: Farouk Bouabid <farouk.bouabid@theobroma-systems.com>
Reviewed-by: Jessica Zhang <quic_jesszhan@quicinc.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20231213145045.41020-1-farouk.bouabid@theobroma-systems.com
2023-12-13  fix ufs_get_locked_folio() breakage  (Al Viro; 1 file, -1/+1)

filemap_lock_folio() returns ERR_PTR(-ENOENT) if the thing is not in cache - not NULL like find_lock_page() used to.

Fixes: 5fb7bd50b351 ("ufs: add ufs_get_locked_folio and ufs_put_locked_folio")
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
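[Editor's note: the calling-convention difference, sketched.]

    /* find_lock_page() style: NULL meant "not in cache" */
    folio = filemap_lock_folio(mapping, index);
    if (!folio)                     /* wrong for filemap_lock_folio() */
            goto not_cached;

    /* filemap_lock_folio() returns ERR_PTR(-ENOENT) instead */
    folio = filemap_lock_folio(mapping, index);
    if (IS_ERR(folio))
            goto not_cached;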
2023-12-13  io_uring/poll: don't enable lazy wake for POLLEXCLUSIVE  (Jens Axboe; 2 files, -3/+20)

There are a few quirks around using lazy wake for poll unconditionally, and one of them is related to EPOLLEXCLUSIVE. That may trigger exclusive wakeups, which wake a limited number of entries in the wait queue. If that wake number is less than the number of entries someone is waiting for (and that someone is also using DEFER_TASKRUN), then we can get stuck waiting for more entries while we should be processing the ones we already got. If we're doing exclusive poll waits, flag the request as not being compatible with lazy wakeups.

Reported-by: Pavel Begunkov <asml.silence@gmail.com>
Fixes: 6ce4a93dbb5b ("io_uring/poll: use IOU_F_TWQ_LAZY_WAKE for wakeups")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
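[Editor's note: a sketch of the mechanism; the REQ_F_POLL_NO_LAZY flag name is modeled on the description ("flag the request as not being compatible with lazy wakeups") and should be treated as an assumption about the fix's internals.]

    /* when arming the poll handler */
    if (poll->events & EPOLLEXCLUSIVE)
            req->flags |= REQ_F_POLL_NO_LAZY;

    /* when queueing task work for a poll event */
    unsigned flags = 0;

    if (!(req->flags & REQ_F_POLL_NO_LAZY))
            flags = IOU_F_TWQ_LAZY_WAKE;    /* lazy wake only when safe */
    __io_req_task_work_add(req, flags);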
2023-12-13  Merge branch 'virtio-net-dynamic-coalescing-moderation'  (David S. Miller; 2 files, -49/+249)

Heng Qi says:
====================
virtio-net: support dynamic coalescing moderation

Now, virtio-net already supports per-queue moderation parameter setting. Based on this, we use the Linux dimlib to support dynamic coalescing moderation for virtio-net. Due to some scheduling issues, we only support and test the rx dim.

Some test results:

I. Sockperf UDP
=================================================
1. Env
rxq_0 with affinity to cpu_0.
2. Cmd
client: taskset -c 0 sockperf tp -p 8989 -i $IP -t 10 -m 16B
server: taskset -c 0 sockperf sr -p 8989
3. Result
dim off: 1143277.00 rxpps, throughput 17.844 MBps, cpu is 100%.
dim on: 1124161.00 rxpps, throughput 17.610 MBps, cpu is 83.5%.
=================================================

II. Redis
=================================================
1. Env
There are 8 rxqs, and rxq_i with affinity to cpu_i.
2. Result
When all cpus are 100%, ops/sec of memtier_benchmark client is
dim off: 978437.23
dim on: 1143638.28
=================================================

III. Nginx
=================================================
1. Env
There are 8 rxqs and rxq_i with affinity to cpu_i.
2. Result
When all cpus are 100%, requests/sec of wrk client is
dim off: 877931.67
dim on: 1019160.31
=================================================

IV. Latency of sockperf udp
=================================================
1. Rx cmd
taskset -c 0 sockperf sr -p 8989
2. Tx cmd
taskset -c 0 sockperf pp -i ${ip} -p 8989 -t 10
After running this cmd 5 times and averaging the results,
3. Result
dim off: 17.7735 usec
dim on: 18.0110 usec
=================================================

Changelog:
v7->v8: Add select DIMLIB.
v6->v7:
- Drop the patch titled "spin lock for ctrl cmd access"
- Use rtnl_trylock to avoid the deadlock.
v5->v6:
- Add patch(4/5): spin lock for ctrl cmd access
- Patch(5/5): Use spin lock and cancel_work_sync to synchronize
v4->v5:
- Patch(4/4): Fix possible synchronization issues with cancel_work_sync. Reduce if/else nesting levels.
v3->v4:
- Patch(5/5): drop.
v2->v3:
- Patch(4/5): some minor modifications.
v1->v2:
- Patch(2/5): a minor fix.
- Patch(4/5): improve the judgment of dim switch conditions. Cancel the work when vq reset.
- Patch(5/5): drop the tx dim implementation.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-13  virtio-net: support rx netdim  (Heng Qi; 2 files, -14/+163)

By comparing the traffic information across complete napi processes, let the virtio-net driver automatically adjust the coalescing moderation parameters of each receive queue.

Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
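[Editor's note: the dimlib pattern this follows, sketched. dim_update_sample(), net_dim(), and net_dim_get_rx_moderation() are the dimlib API; the receive-queue fields and the worker body are assumptions for illustration.]

    #include <linux/dim.h>

    /* at the end of a completed napi poll for an rx queue */
    static void rx_dim_update(struct receive_queue *rq) /* hypothetical type */
    {
            struct dim_sample sample;

            dim_update_sample(rq->calls, rq->packets, rq->bytes, &sample);
            net_dim(&rq->dim, sample);      /* may schedule dim->work */
    }

    /* dim worker applies the chosen profile via a coalescing ctrl cmd */
    static void rx_dim_work(struct work_struct *work)
    {
            struct dim *dim = container_of(work, struct dim, work);
            struct dim_cq_moder moder =
                    net_dim_get_rx_moderation(dim->mode, dim->profile_ix);

            /* send a per-vq coalescing command with moder.usec / moder.pkts */
            dim->state = DIM_START_MEASURE;
    }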
2023-12-13  virtio-net: extract virtqueue coalescing cmd for reuse  (Heng Qi; 1 file, -42/+64)

Extract the commands that set virtqueue coalescing parameters for reuse by ethtool -Q, vq resize, and netdim.

Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-13  virtio-net: separate rx/tx coalescing moderation cmds  (Heng Qi; 1 file, -3/+28)

This patch separates the rx and tx global coalescing moderation commands to support netdim switches in subsequent patches.

Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-13  virtio-net: return whether napi is complete  (Heng Qi; 1 file, -1/+5)

rx netdim needs to count the traffic during a complete napi process, and start updating and comparing samples to make decisions after the napi ends. Let virtqueue_napi_complete() return true if napi is done, and false otherwise.

Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-13  Merge branch 'ionic-pci-errors'  (David S. Miller; 6 files, -15/+81)

Shannon Nelson says:
====================
ionic: updates to PCI error handling

These are improvements to our PCI error handling, including FLR and AER events.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-13  ionic: fill out pci error handlers  (Shannon Nelson; 1 file, -0/+25)

Set up the pci_error_handlers error_detected and resume callbacks to be useful in handling AER events. If the error detected is pci_channel_io_frozen, we set up to do an FLR at the end of the AER handling - this tends to clear things up well enough that traffic can continue. Else, let the AER/PCI machinery do what is needed for the less serious errors seen.

Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
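[Editor's note: the shape of such handlers, sketched against the PCI AER API (pci_error_handlers, pci_ers_result_t, pci_channel_io_frozen are core PCI); the ionic helper names are assumptions.]

    static pci_ers_result_t ionic_pci_error_detected(struct pci_dev *pdev,
                                                     pci_channel_state_t state)
    {
            if (state == pci_channel_io_frozen) {
                    /* stop I/O now and request the reset path (FLR later) */
                    ionic_reset_prepare(pdev);      /* hypothetical helper */
                    return PCI_ERS_RESULT_NEED_RESET;
            }
            /* less serious errors: let the AER/PCI machinery handle it */
            return PCI_ERS_RESULT_RECOVERED;
    }

    static void ionic_pci_error_resume(struct pci_dev *pdev)
    {
            /* normal operation may resume; restart if we stopped earlier */
            ionic_reset_done(pdev);                 /* hypothetical helper */
    }

    static const struct pci_error_handlers ionic_err_handlers = {
            .error_detected = ionic_pci_error_detected,
            .resume         = ionic_pci_error_resume,
    };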
2023-12-13  ionic: lif debugfs refresh on reset  (Shannon Nelson; 2 files, -0/+4)

Remove and restore the lif's debugfs pointers on a reset, and make sure to check for the dentry before removing it, in case an earlier reset failed to rebuild the lif.

Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>