|
In the case of XDP Multi-Buffer with Striding RQ, an extra
page is allocated for the linear part of non-linear SKBs.
Including headroom and tailroom in the calculation may
result in an unnecessary increase in the amount of memory
allocated. This could be critical, particularly for large
MTUs (e.g. 7975B) and large RQ sizes (e.g. 8192).
In this case, the requested page pool size is 64K, but
32K would be sufficient. This causes a failure due to
exceeding the page pool size limit of 32K.
Exclude headroom and tailroom from SKB size calculations
to reduce page pool size.
Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Packet data buffers lack reserved headroom or tailroom,
and SKBs are allocated on a side memory when needed.
Exclude the tailroom from the SKB size calculations.
Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
The current SWS debug dump mechanism implements the seq_file interface,
but only the 'show' callback, which dumps the whole steering DB in a
single call.
However, for large data sizes the seq_printf function fails to
allocate a buffer large enough to hold the data.
This patch solves this problem by utilizing the seq_file interface
mechanism in the following way:
- when the user triggers a dump procedure, we will allocate a list of
buffers that hold the whole data dump (in the start callback)
- using the start, next, show and stop callbacks of the seq_file
API we iterate through the list and dump the whole data
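
The iteration described above can be sketched in user space as follows. This is a minimal analogue, not the driver code: `struct dump_buf` and the `dump_*` helpers are hypothetical names, and the real seq_file callbacks take a `struct seq_file *` rather than an output string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One pre-rendered chunk of the dump (hypothetical type). */
struct dump_buf {
	const char *data;
	struct dump_buf *next;
};

/* start: return the chunk at position *pos, or NULL past the end. */
static void *dump_start(struct dump_buf *head, long *pos)
{
	struct dump_buf *b = head;
	long i;

	for (i = 0; b && i < *pos; i++)
		b = b->next;
	return b;
}

/* next: advance the position and return the following chunk. */
static void *dump_next(void *v, long *pos)
{
	struct dump_buf *b = v;

	++*pos;
	return b->next;
}

/* show: emit exactly one chunk; fail rather than truncate. */
static int dump_show(char *out, size_t cap, void *v)
{
	const struct dump_buf *b = v;

	if (strlen(out) + strlen(b->data) + 1 > cap)
		return -1;
	strcat(out, b->data);
	return 0;
}

/* stop: nothing to release in this sketch. */
static void dump_stop(void *v)
{
	(void)v;
}

/* Drive the four callbacks the way a reader of the file would. */
static int dump_all(struct dump_buf *head, char *out, size_t cap)
{
	long pos = 0;
	int err = 0;
	void *v;

	assert(out && cap > 0);
	out[0] = '\0';
	for (v = dump_start(head, &pos); v; v = dump_next(v, &pos)) {
		err = dump_show(out, cap, v);
		if (err)
			break;
	}
	dump_stop(v);
	return err;
}
```

Because each show call handles a single chunk, no single call needs a buffer large enough for the whole dump.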
Signed-off-by: Hamdan Igbaria <hamdani@nvidia.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Lack of SyncE capability should not emit a warning, change the print to
debug level.
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Device definitions belong in mlx5_ifc, remove the duplicates in
mlx5_core.h.
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
The function wait_fw_init() returns the same error code whether it stops
waiting because of a timeout or for another reason. As a result, its
callers print a timeout error message without checking the error type.
Return a different error code for each failure reason and print the
matching error message in wait_fw_init() itself.
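
A minimal user-space sketch of the idea, assuming `-ETIMEDOUT` for the timeout case and `-ENODEV` for the break case (the message above does not name the codes, so both are illustrative). `struct fake_dev` and `wait_fw_init_sketch()` are hypothetical stand-ins for the driver's structures:

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical device state used only for this illustration. */
struct fake_dev {
	int polls_until_ready;	/* firmware is ready after this many polls */
	int break_after;	/* break requested after this many polls, <0 = never */
};

/* Return 0 on success and a distinct negative errno per failure reason. */
static int wait_fw_init_sketch(const struct fake_dev *dev, int max_polls)
{
	int i;

	assert(dev && max_polls >= 0);
	for (i = 0; i < max_polls; i++) {
		if (dev->break_after >= 0 && i >= dev->break_after)
			return -ENODEV;		/* waiting was interrupted */
		if (i >= dev->polls_until_ready)
			return 0;		/* firmware initialized */
	}
	return -ETIMEDOUT;			/* budget exhausted */
}
```

With distinct codes, a caller can report a timeout only when the wait really timed out.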
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
When PF/VF teardown is called, the driver sets the flag
MLX5_BREAK_FW_WAIT to stop waiting for FW loading and initialization.
The same should apply to SF driver teardown to cut waiting time. In
mlx5_sf_dev_remove(), set the flag before draining the health WQ, as the
recovery flow may also wait for FW reloading while it is no longer
relevant.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
A function that is not a Physical Function is not allowed to get a FW
core dump, so an attempt fails the fw health reporter dump option.
Instead of failing, remove the fw_fatal health reporter dump option
for such functions.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
A function that is not a Physical Function is not allowed to collect
crdump, so an attempt fails the fw_fatal health reporter dump option.
Instead of failing on permission, remove the fw_fatal health reporter
dump option for such functions.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Mlx5 has two functions with the same name, mlx5_sf_dev_remove. Both are
static and in different files, so there is no compilation or logical
issue, but it makes the code hard to follow, and some traces can even
show both, as one leads to the other [1]. Rename one to
mlx5_sf_dev_remove_aux() as it actually removes the auxiliary device of
the SF.
[1]
mlx5_sf_dev_remove+0x2a/0x70 [mlx5_core]
auxiliary_bus_remove+0x18/0x30
device_release_driver_internal+0x199/0x200
bus_remove_device+0xd7/0x140
device_del+0x153/0x3d0
? process_one_work+0x16a/0x4b0
mlx5_sf_dev_remove+0x2e/0x90 [mlx5_core]
mlx5_sf_dev_table_destroy+0xa0/0x100 [mlx5_core]
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Fix counter name in documentation of mlx5 vnic health reporter diagnose
output: total_error_queues.
While here fix alignment in the documentation file of another counter,
comp_eq_overrun, as it should have its own line and not be part of
another counter's description.
Example:
$ devlink health diagnose pci/0000:00:04.0 reporter vnic
vNIC env counters:
total_error_queues: 0 send_queue_priority_update_flow: 0
comp_eq_overrun: 0 async_eq_overrun: 0 cq_overrun: 0
invalid_command: 0 quota_exceeded_command: 0
nic_receive_steering_discard: 0
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
After the addition of HW-managed counters and the implementation of drop
in the flow steering logic, the driver code that checks the syndrome
is no longer reachable.
Delete it.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Fill integrity, replay and bad trailer counters.
As an example, after simulating replay window attack with 5 packets:
[leonro@c ~]$ grep XfrmInStateSeqError /proc/net/xfrm_stat
XfrmInStateSeqError 5
[leonro@c ~]$ sudo ip -s x s
<...>
stats:
replay-window 0 replay 5 failed 0
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Iterate over all SAs in order to fill global IPsec statistics.
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
In order to allow drivers to fill all statistics, change the name
of xdo_dev_state_update_curlft to be xdo_dev_state_update_stats.
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Add a Makefile for netdevsim selftests and add the selftests path to
MAINTAINERS.
Signed-off-by: David Wei <dw@davidwei.uk>
Link: https://lore.kernel.org/r/20240130214620.3722189-5-dw@davidwei.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Vladimir Oltean says:
====================
Fixups for qca8k ds->user_mii_bus cleanup
The series "ds->user_mii_bus cleanup (part 1)" from the last development
cycle:
https://patchwork.kernel.org/project/netdevbpf/cover/20240104140037.374166-1-vladimir.oltean@nxp.com/
had some review comments I didn't have the time to address at the time.
One from Alvin and one from Luiz. They can reasonably be treated as
improvements for v6.9.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
It was pointed out during the review [1] of commit 68e1010cda79 ("net:
dsa: qca8k: put MDIO bus OF node on qca8k_mdio_register() failure") that
the rest of the qca8k driver uses "int ret" rather than "int err".
Make everything consistent in that regard, not only
qca8k_mdio_register(), but also qca8k_setup_mdio_bus().
[1] https://lore.kernel.org/netdev/qyl2w3ownx5q7363kqxib52j5htar4y6pkn7gen27rj45xr4on@pvy5agi6o2te/
Suggested-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
It was pointed out during the review [1] of commit e66bf63a7f67 ("net:
dsa: qca8k: skip MDIO bus creation if its OF node has status =
"disabled"") that we now leak a reference to the "mdio" OF node if it is
disabled.
This is only a concern when using dynamic OF as far as I can tell (like
probing on an overlay), since OF nodes are never freed in the regular
case. Additionally, I'm unaware of any actual device trees (in
production or elsewhere) which have status = "disabled" for the MDIO OF
node. So handling this as a simple enhancement.
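
The reference leak described above follows a common pattern, sketched here with a toy refcount. The `toy_*` names are hypothetical; in the kernel the corresponding calls would be of_get_child_by_name(), of_device_is_available() and of_node_put(), and the exact shape of the fix in qca8k is not shown in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a refcounted device tree node. */
struct toy_node {
	int refcount;
	bool available;		/* false models status = "disabled" */
};

static struct toy_node *toy_get_child(struct toy_node *n)
{
	if (n)
		n->refcount++;	/* lookup hands out a reference */
	return n;
}

static void toy_put(struct toy_node *n)
{
	if (n) {
		assert(n->refcount > 0);
		n->refcount--;
	}
}

/* Every exit path after a successful lookup must drop the reference. */
static int toy_register_bus(struct toy_node *mdio)
{
	struct toy_node *child = toy_get_child(mdio);
	int ret = 0;

	if (!child)
		return 0;
	if (!child->available)
		goto out_put;	/* a bare "return 0" here would leak */

	/* ... bus registration would happen here ... */

out_put:
	toy_put(child);
	return ret;
}
```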
[1] https://lore.kernel.org/netdev/CAJq09z4--Ug+3FAmp=EimQ8HTQYOWOuVon-PUMGB5a1N=RPv4g@mail.gmail.com/
Suggested-by: Luiz Angelo Daros de Luca <luizluca@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
These got misaligned after commit 6ca80638b90c ("net: dsa: Use conduit
and user terms").
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
commit 1c870c63d7d2 ("net: fill in MODULE_DESCRIPTION()s for ocelot")
got a suggestion from Vladimir Oltean after it had landed in net-next.
Rewrite the module description according to Vladimir's suggestion.
Fixes: 1c870c63d7d2 ("net: fill in MODULE_DESCRIPTION()s for ocelot")
Suggested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
lib/test_blackhole_dev.c sets a variable that is never read, causing
the following build warning:
lib/test_blackhole_dev.c:32:17: warning: variable 'ethh' set but not used [-Wunused-but-set-variable]
Remove the variable struct ethhdr *ethh, which is unused.
Fixes: 509e56b37cc3 ("blackhole_dev: add a selftest")
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Matthieu Baerts says:
====================
mptcp: annotate lockless access
This is a series of 5 patches from Paolo to annotate lockless access.
The MPTCP locking schema is already quite complex. We need to clarify it
and make the lockless access already there consistent, or later changes
will be even harder to follow and understand.
This series goes through all the msk fields accessed in the RX and TX
path and makes the lockless annotation consistent with the in-use
locking schema.
As a bonus, this should fix data races eventually found by fuzzers --
even if we haven't seen many such reports so far.
Patch 1/5 hints we could remove "local_key" and "remote_key" from the
subflow context, and always use the ones from the msk socket, possibly
reducing the context memory usage. That change is left over as a
possible follow-up.
====================
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The following MPTCP socket fields:
- can_ack
- fully_established
- rcv_data_fin
- snd_data_fin_enable
- rcv_fastclose
- use_64bit_ack
are accessed without any lock, add the appropriate annotation.
The schema is safe as each field can change its value at most
once in the whole mptcp socket life cycle.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The token field is manipulated under the msk socket lock
and accessed lockless in a few spots; add the proper ONCE annotation.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The following fields:
- ack_seq
- snd_una
- wnd_end
- rmem_fwd_alloc
are protected by the data lock and accessed lockless in a few
spots. Ensure ONCE annotation for write (under such lock) and for
lockless read.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The mptcp-level TX path info (write_seq, bytes_sent, snd_nxt) are under
the msk socket lock protection, and are accessed lockless in a few spots.
Always mark the write operations with WRITE_ONCE, read operations
outside the lock with READ_ONCE and drop the annotation for read
under such lock.
To simplify the annotations move mptcp_pending_data_fin_ack() from
__mptcp_data_acked() to __mptcp_clean_una(), under the msk socket
lock, where such call would belong.
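
The annotation rule above can be illustrated with a small user-space sketch. The READ_ONCE/WRITE_ONCE macros below are simplified stand-ins for the kernel ones (assuming GCC or Clang for `__typeof__`), and `struct toy_msk` with its `lock_held` flag is a hypothetical model of a lock-protected field, not the MPTCP socket.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel macros: force a single access. */
#define READ_ONCE(x)		(*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v)	(*(volatile __typeof__(x) *)&(x) = (v))

/* Toy socket: lock_held only models "caller holds the lock". */
struct toy_msk {
	int lock_held;
	uint64_t write_seq;
};

/* Writer runs under the lock: plain read, annotated write. */
static void toy_advance_locked(struct toy_msk *msk, uint64_t len)
{
	assert(msk->lock_held);
	WRITE_ONCE(msk->write_seq, msk->write_seq + len);
}

/* Lockless reader: annotated read, no lock required. */
static uint64_t toy_peek(const struct toy_msk *msk)
{
	return READ_ONCE(msk->write_seq);
}
```

The read inside toy_advance_locked() stays unannotated because the lock already excludes concurrent writers.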
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Both the local and the remote key follow the same locking
schema, put in place the proper ONCE accessors.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
A similar chunk of code is used in tsnep_rx_poll_zc() and
tsnep_rx_reopen_xsk() to maintain the RX XDP_RING_NEED_WAKEUP flag.
Consolidate the code into a common helper function.
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We can change sctp_sk() to propagate its argument const qualifier,
thanks to container_of_const().
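
A user-space sketch of how const propagation through container_of_const() can work, using C11 `_Generic` and the GCC/Clang `__typeof__` extension. The macro bodies are simplified approximations of the kernel helpers, and `toy_sock`, `toy_sctp_sock` and `toy_sctp_sk()` are hypothetical types and names:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified container_of: no type checking of the member pointer. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Result is const-qualified exactly when the input pointer is. */
#define container_of_const(ptr, type, member)				\
	_Generic((ptr),							\
		const __typeof__(*(ptr)) *:				\
			(const type *)container_of(ptr, type, member),	\
		default:						\
			(type *)container_of(ptr, type, member))

struct toy_sock { int state; };
struct toy_sctp_sock { int id; struct toy_sock inet; };

#define toy_sctp_sk(ptr) \
	container_of_const(ptr, struct toy_sctp_sock, inet)

/* Evaluates to 1 when p is a pointer to const toy_sctp_sock. */
#define IS_CONST_PTR(p) \
	_Generic((p), const struct toy_sctp_sock *: 1, default: 0)

/* A const input yields a const result, so no qualifier is dropped. */
static int toy_sctp_id(const struct toy_sock *sk)
{
	assert(sk);
	return toy_sctp_sk(sk)->id;
}
```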
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Cc: Xin Long <lucien.xin@gmail.com>
Acked-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We can use a global dev_unreg_count counter instead
of a per netns one.
As a bonus we can factorize the changes done on it
for bulk device removals.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Implement the tun .get_channels functionality. This feature is necessary
for some tools, such as libxdp, which need to retrieve the queue count.
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This fixes the following code style problems:
- WARNING: please, no spaces at the start of a line
- CHECK: Please use a blank line after
function/struct/union/enum declarations
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This adds support for the RTL8126A found on Asus z790 Maximus Formula.
It was successfully tested w/o the firmware at 1000Mbps. Firmware file
has been provided by Realtek and submitted to linux-firmware.
2.5G and 5G modes are untested.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Paolo points out that ifconfig is legacy and we should not use it.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
By default, the lan8841's 1588 clock frequency is 125MHz. But when
adjusting the frequency, the driver uses the 1PPM format of the lan8814,
which is the wrong format because the lan8814 has a 1588 clock frequency
of 250MHz. As a result, each 1PPM adjustment changes the frequency by
less than expected.
Therefore fix this by using the correct 1PPM format for lan8841.
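
A worked example of why the clock frequency matters. It assumes the rate adjustment is expressed as a per-cycle step in units of 2^-32 ns; that unit is an assumption made for illustration and is not stated in this message. `one_ppm_step()` is a hypothetical helper:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Per-cycle step for a 1 PPM change, in assumed units of 2^-32 ns.
 * period_ns * 1e-6 * 2^32 simplifies to 1000 * 2^32 / clock_hz.
 */
static uint64_t one_ppm_step(uint64_t clock_hz)
{
	assert(clock_hz > 0);
	return (1000ULL << 32) / clock_hz;
}
```

Under this assumption one_ppm_step(125000000) is 34359 and one_ppm_step(250000000) is 17179, so applying the 250MHz value to a 125MHz clock gives roughly half of the intended adjustment.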
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Christian Marangi says:
====================
net: phy: qcom: qca808x: fixup qca808x LED
This is a bit embarrassing and totally my fault, so sorry for that!
While reworking the patch to use the phy_modify API, a logic error
was introduced that broke the brightness_set function. It wasn't
noticed when testing the last revisions, as the testing method was to
verify that hw control was working correctly.
Noticing this problem also made me notice an additional problem
with the polarity.
The introduced patch made the polarity configurable, but I forgot
to add the required code to enable Active High by default.
(the PHY sets active low by default)
This wasn't noticed with hw control testing, as the LED blinks on
traffic and the polarity problem is not noticeable.
It might be worth discussing whether a change in implementation is
needed, where the polarity function is always called, but I think it's
better this way, where a specific PHY applies a fixup with the help
of the priv struct in the config_init phase.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The qca808x PHY provides support for the led_polarity_set OP to configure
and apply the active-low property, but on PHY reset the Active High bit
is not set, resulting in the LED being driven as active-low.
To fix this, check if active-low is not set in DT and enable Active High
polarity by default to restore correct functionality of the LED.
Fixes: 7196062b64ee ("net: phy: at803x: add LED support for qca808x")
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When switching to phy_modify_mmd and a shorter version of the LED ON/OFF
condition in a later revision, a logic error was introduced:
value ? QCA808X_LED_FORCE_ON : QCA808X_LED_FORCE_OFF always takes the
true branch, because value is always ORed with QCA808X_LED_FORCE_EN due
to missing (), making the tested condition QCA808X_LED_FORCE_EN | value.
Add the () to apply the correct condition and restore correct
functionality of the brightness ON/OFF.
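
The precedence problem can be reproduced in a few lines. In C, `|` binds tighter than `?:`, so the unparenthesized form tests the OR result. The bit values below are hypothetical and chosen only for illustration; they are not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register bits, for illustration only. */
#define LED_FORCE_EN	0x8000u
#define LED_FORCE_ON	0x2000u
#define LED_FORCE_OFF	0x0000u

/* Buggy: parsed as (LED_FORCE_EN | value) ? ON : OFF, always ON. */
static uint16_t led_bits_buggy(int value)
{
	return LED_FORCE_EN | value ? LED_FORCE_ON : LED_FORCE_OFF;
}

/* Fixed: the parentheses make the ternary depend on value alone. */
static uint16_t led_bits_fixed(int value)
{
	return LED_FORCE_EN | (value ? LED_FORCE_ON : LED_FORCE_OFF);
}
```

Note that the buggy form also loses the enable bit, since the OR becomes part of the condition instead of the result. Compilers may flag the buggy line with a parentheses warning, which is expected here.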
Fixes: 7196062b64ee ("net: phy: at803x: add LED support for qca808x")
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Jakub Kicinski says:
====================
tools: ynl: auto-gen for all genetlink families
The code gen has caught up with all features required in genetlink
families in Linux 6.8 already. We have also stopped committing
auto-generated user space code to the tree. Instead of listing all the
families in the Makefile search the spec directory, and generate
code for everything that's not legacy netlink.
====================
Link: https://lore.kernel.org/r/20240202004926.447803-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Instead of listing the genetlink families that we want to codegen
for, always codegen for everyone. We can add an opt-out later but
it seems like most families are not causing any issues, and yet
folks forget to add them to the Makefile.
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20240202004926.447803-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add ovs_flow, ovs_vport and ovs_datapath to the families supported
in C. ovs-flow has some circular nesting which is fun to deal with,
but the necessary support has been added already in the previous
release cycle.
Add a sample that proves that dealing with fixed headers does
actually work correctly.
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20240202004926.447803-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The DPLL and mptcp_pm families are pretty clean, and YNL C codegen
supports them fully with no changes. Add them to user space codegen
so that C samples can be written, and we know immediately if changes
to these families require YNL codegen work.
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20240202004926.447803-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Both addrconf_verify_work() and addrconf_dad_work() acquire rtnl,
there is no point trying to have one thread per cpu.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20240201173031.3654257-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
We don't have to store the EEE modes to be advertised in the driver,
phylib does this for us and stores it in phydev->advertising_eee.
phylib also takes care of properly handling the EEE advertisement.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/27c336a8-ea47-483d-815b-02c45ae41da2@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
A user reported that first consumer mainboards show up with a RTL8126A
5Gbps MAC/PHY. This adds support for the integrated PHY, which is also
available stand-alone. From a PHY driver perspective it's treated the
same as the 2.5Gbps PHYs, we just have to support the new PHY ID.
Reported-by: Joe Salmeri <jmscdba@gmail.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Joe Salmeri <jmscdba@gmail.com>
Link: https://lore.kernel.org/r/0c8e67ea-6505-43d1-bd51-94e7ecd6e222@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Michal Koutný says:
====================
net/sched: Load modules via alias
These modules may be loaded lazily without user's awareness and
control. Add respective aliases to modules and request them under these
aliases so that modprobe's blacklisting mechanism (through aliases)
works for them. (The same pattern exists e.g. for filesystem
modules.)
For example (before the change):
$ tc filter add dev lo parent 10: protocol ip prio 10 handle 1: cgroup
# cls_cgroup module is loaded despite a `blacklist cls_cgroup` entry
# in /etc/modprobe.d/*.conf
After the change:
$ tc filter add dev lo parent 10: protocol ip prio 10 handle 1: cgroup
Error: TC classifier not found.
We have an error talking to the kernel
# explicit/acknowledged (privileged) action is needed
$ modprobe cls_cgroup
# blacklist entry won't apply to this direct modprobe, module is
# loaded with awareness
An alternative considered was always invoking `modprobe -b` from
request_module(); it was dismissed as too intrusive and slightly
confusing, in favor of the precedent of aliases (commit 7f78e0351394
("fs: Limit sys_mount to only request filesystem modules.")).
User experience suffers in both alternatives. Its improvement is
orthogonal to blacklist honoring.
v1: https://lore.kernel.org/r/20231121175640.9981-1-mkoutny@suse.com
v2: https://lore.kernel.org/r/20231206192752.18989-1-mkoutny@suse.com
v3: https://lore.kernel.org/r/20240112180646.13232-1-mkoutny@suse.com
v4: https://lore.kernel.org/r/20240123135242.11430-1-mkoutny@suse.com
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
====================
Link: https://lore.kernel.org/r/20240201130943.19536-1-mkoutny@suse.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The module sch_ingress stands out among net/sched modules
because it provides multiple act/sch functionalities in a single .ko.
They have aliases to make autoloading work for any of the provided
functionalities.
Since the autoloading was changed to uniformly request any functionality
under its alias, the non-systemic aliases can be removed now (i.e.
assuming the aliases were only used to ensure autoloading).
Signed-off-by: Michal Koutný <mkoutny@suse.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20240201130943.19536-5-mkoutny@suse.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The cls_, sch_ and act_ modules may be loaded lazily during network
configuration but without user's awareness and control.
Switch the lazy loading from canonical module names to a module alias.
This allows finer control over lazy loading, the precedent from
commit 7f78e0351394 ("fs: Limit sys_mount to only request filesystem
modules.") explains it already:
Using aliases means user space can control the policy of which
filesystem^W net/sched modules are auto-loaded by editing
/etc/modprobe.d/*.conf with blacklist and alias directives.
Allowing simple, safe, well understood work-arounds to known
problematic software.
By default, nothing changes. However, if a specific module is
blacklisted (its canonical name), it won't be modprobe'd when requested
under its alias (i.e. kernel auto-loading). It would appear as if the
given module was unknown.
The module can still be loaded under its canonical name, which is an
explicit (privileged) user action.
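
A small sketch of what requesting a module under an alias amounts to: the kernel side asks for a prefixed name built from the classifier kind, and modprobe then applies its blacklist rules while resolving that alias. The "net-cls-" prefix and the `toy_cls_alias()` helper are assumptions made for illustration; this message does not quote the exact alias strings.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed alias prefix for classifier modules (illustrative only). */
#define TOY_NET_CLS_ALIAS_PREFIX "net-cls-"

/* Build the alias to request for a classifier kind such as "cgroup". */
static int toy_cls_alias(char *out, size_t cap, const char *kind)
{
	int n;

	assert(out && kind);
	n = snprintf(out, cap, TOY_NET_CLS_ALIAS_PREFIX "%s", kind);
	/* Report truncation as an error instead of using a partial name. */
	return (n < 0 || (size_t)n >= cap) ? -1 : 0;
}
```

Requesting the canonical name directly, as in an explicit `modprobe cls_cgroup`, bypasses the alias and therefore the blacklist entry.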
Signed-off-by: Michal Koutný <mkoutny@suse.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20240201130943.19536-4-mkoutny@suse.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
No functional change intended, aliases will be used in followup commits.
Note for backporters: you may need to add aliases also for modules that
are already removed in mainline kernel but still in your version.
Patches were generated with the help of Coccinelle scripts like:
cat >scripts/coccinelle/misc/tcf_alias.cocci <<EOD
virtual patch
virtual report
@ haskernel @
@@
@ tcf_has_kind depends on report && haskernel @
identifier ops;
constant K;
@@
static struct tcf_proto_ops ops = {
.kind = K,
...
};
+char module_alias = K;
EOD
/usr/bin/spatch -D report --cocci-file scripts/coccinelle/misc/tcf_alias.cocci \
--dir . \
-I ./arch/x86/include -I ./arch/x86/include/generated -I ./include \
-I ./arch/x86/include/uapi -I ./arch/x86/include/generated/uapi \
-I ./include/uapi -I ./include/generated/uapi \
--include ./include/linux/compiler-version.h --include ./include/linux/kconfig.h \
--jobs 8 --chunksize 1 2>/dev/null | \
sed 's/char module_alias = "\([^"]*\)";/MODULE_ALIAS_NET_CLS("\1");/'
And analogously for:
static struct tc_action_ops ops = {
.kind = K,
static struct Qdisc_ops ops = {
.id = K,
(Someone familiar would be able to fit those into one .cocci file
without sed post processing.)
Signed-off-by: Michal Koutný <mkoutny@suse.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20240201130943.19536-3-mkoutny@suse.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|