Age | Commit message (Collapse) | Author | Files | Lines |
|
Subclassing the generic USB device driver to override the
default configuration selection regardless of matching interface
drivers.
The r815x family devices expose a vendor specific function which
the r8152 interface driver wants to handle. This is the preferred
device mode. Additionally one or more USB class functions are
usually supported for hosts lacking a vendor specific driver. The
choice is USB configuration based, with one alternate function per
configuration.
Example device with both NCM and ECM alternate cfgs:
T: Bus=02 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 4 Spd=5000 MxCh= 0
D: Ver= 3.20 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 9 #Cfgs= 3
P: Vendor=0bda ProdID=8156 Rev=31.00
S: Manufacturer=Realtek
S: Product=USB 10/100/1G/2.5G LAN
S: SerialNumber=001000001
C:* #Ifs= 1 Cfg#= 1 Atr=a0 MxPwr=256mA
I:* If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=00 Driver=r8152
E: Ad=81(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E: Ad=02(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E: Ad=83(I) Atr=03(Int.) MxPS= 2 Ivl=128ms
C: #Ifs= 2 Cfg#= 2 Atr=a0 MxPwr=256mA
I: If#= 0 Alt= 0 #EPs= 1 Cls=02(comm.) Sub=0d Prot=00 Driver=
E: Ad=83(I) Atr=03(Int.) MxPS= 16 Ivl=128ms
I: If#= 1 Alt= 0 #EPs= 0 Cls=0a(data ) Sub=00 Prot=01 Driver=
I: If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=01 Driver=
E: Ad=81(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E: Ad=02(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
C: #Ifs= 2 Cfg#= 3 Atr=a0 MxPwr=256mA
I: If#= 0 Alt= 0 #EPs= 1 Cls=02(comm.) Sub=06 Prot=00 Driver=
E: Ad=83(I) Atr=03(Int.) MxPS= 16 Ivl=128ms
I: If#= 1 Alt= 0 #EPs= 0 Cls=0a(data ) Sub=00 Prot=00 Driver=
I: If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=00 Driver=
E: Ad=81(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E: Ad=02(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
A problem with this is that Linux will prefer class functions over
vendor specific functions. Using the above example, Linux defaults
to cfg #2, running the device in a sub-optimal NCM mode.
Previously we've attempted to work around the problem by
blacklisting the devices in the ECM class driver "cdc_ether", and
matching on the ECM class function in the vendor specific interface
driver. The latter has been used to switch back to the vendor
specific configuration when the driver is probed for a class
function.
This workaround has several issues;
- class driver blacklists is additional maintanence cruft in an
unrelated driver
- class driver blacklists prevents users from optionally running
the devices in class mode
- each device needs double match entries in the vendor driver
- the initial probing as a class function slows down device
discovery
Now these issues have become even worse with the introduction of
firmware supporting both NCM and ECM, where NCM ends up as the
default mode in Linux. To use the same workaround, we now have
to blacklist the devices in to two different class drivers and
add yet another match entry to the vendor specific driver.
This patch implements an alternative workaround strategy -
independent of the interface drivers. It avoids adding a
blacklist to the cdc_ncm driver and will let us remove the
existing blacklist from the cdc_ether driver.
As an additional bonus, removing the blacklists allow users to
select one of the other device modes if wanted.
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Mat Martineau says:
====================
mptcp: Protocol in-use tracking and code cleanup
Here's a collection of commits from the MPTCP tree:
Patches 1-4 and 6 contain miscellaneous code cleanup for more consistent
use of helper functions, existing local variables, and better naming.
Patches 5, 7, and 9 add sock_prot_inuse tracking for MPTCP and an
associated self test.
Patch 8 modifies the mptcp_connect self test tool to exit on SIGUSR1
when in "slow mode".
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add the function chk_msk_inuse() to diag.sh, which is used to check the
statistics of mptcp socket in use. As mptcp socket in listen state will
be closed randomly after 'accept', we need to get the count of listening
mptcp socket through 'ss' command.
All tests pass.
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
For now, mptcp_connect won't exit after receiving the 'SIGUSR1' signal
if '-r' is set. Fix this by skipping poll and sleep in copyfd_io_poll()
if 'quit' is set.
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Do the statistics of mptcp socket in use with sock_prot_inuse_add().
Therefore, we can get the count of used mptcp socket from
/proc/net/protocols:
& cat /proc/net/protocols
protocol size sockets memory press maxhdr slab module cl co di ac io in de sh ss gs se re sp bi br ha uh gp em
MPTCPv6 2048 0 0 no 0 yes kernel y n y y y y y y y y y y n n n y y y n
MPTCP 1896 1 0 no 0 yes kernel y n y y y y y y y y y y n n n y y y n
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
'ssk' should be more appropriate to be the name of the first argument
in mptcp_token_new_connect().
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The 'sk_prot' field in token KUNIT self-tests will be dereferenced in
mptcp_token_new_connect(). Therefore, init it with tcp_prot.
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
'sock->sk' is used frequently in mptcp_listen(). Therefore, we can
introduce the 'sk' and replace 'sock->sk' with it.
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The local variable 'ssk' has been defined at the beginning of the function
mptcp_write_options(), use it instead of getting 'ssk' again.
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use the local variable 'net' instead of sock_net() in the functions where
the variable 'struct net *net' has been defined.
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The helper msk_owned_by_me() is defined in protocol.h, so use it instead
of sock_owned_by_me().
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The current source pushes skb into dev-done queue by calling
skb_dequeue_tail() and then pop it by skb_dequeue() to branch to
rx_cleanup state for freeing urb/skb in usbnet_bh(). It takes extra CPU
load, 2.21% (skb_queue_tail) as follows,
- 11.58% 0.26% swapper [k] usbnet_bh
- 11.32% usbnet_bh
- 6.43% skb_dequeue
6.34% _raw_spin_unlock_irqrestore
- 2.21% skb_queue_tail
2.19% _raw_spin_unlock_irqrestore
- 1.68% consume_skb
- 0.97% kfree_skbmem
0.80% kmem_cache_free
0.53% skb_release_data
To reduce the extra CPU load use return values to call helper function
usb_free_skb() to free the resources instead of calling skb_queue_tail()
and skb_dequeue() for push and pop respectively.
- 7.87% 0.25% swapper [k] usbnet_bh
- 7.62% usbnet_bh
- 4.81% skb_dequeue
4.74% _raw_spin_unlock_irqrestore
- 1.75% consume_skb
- 0.98% kfree_skbmem
0.78% kmem_cache_free
0.58% skb_release_data
0.53% smsc95xx_rx_fixup
Signed-off-by: Leesoo Ahn <lsahn@ooseel.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Divya Koppera says:
====================
Fixed warnings
Fixed warnings related to PTR_ERR and initialization.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Handle the NULL pointer case
Fixes New smatch warnings:
drivers/net/phy/micrel.c:2613 lan8814_ptp_probe_once() warn: passing zero to 'PTR_ERR'
vim +/PTR_ERR +2613 drivers/net/phy/micrel.c
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: Divya Koppera <Divya.Koppera@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Initialized return variable
Fixes Old smatch warnings:
drivers/net/phy/micrel.c:1750 ksz886x_cable_test_get_status() error:
uninitialized symbol 'ret'.
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Divya Koppera <Divya.Koppera@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Jiawen Wu says:
====================
net: wangxun: Adjust code structure
Remove useless structs 'txgbe_hw' and 'ngbe_hw' make the codes clear.
And move the same codes which sets MAC address between txgbe and ngbe
to libwx. Further more, rename struct 'wx_hw' to 'wx' and move total
adapter members to wx.
====================
Link: https://lore.kernel.org/r/20230106033853.2806007-1-jiawenwu@trustnetic.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Move the total private structure to libwx.
Signed-off-by: Mengyuan Lou <mengyuanlou@net-swift.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Move the total private structure to libwx.
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In order to move the total members in struct adapter to struct wx_hw
to keep the code clean, it's a bad name of 'wx_hw' only for hardware.
Rename 'wx_hw' to 'wx', and rename the pointers at use.
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
For setting MAC address, both txgbe and ngbe drivers have the same handling
flow with different parameters. Move the same codes to libwx.
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Remove ngbe.h, move defines into ngbe_type.h file.
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Remove txgbe.h, move defines into txgbe_type.h file.
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Remove useless structure ngbe_hw to make the codes clear.
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Remove useless structure txgbe_hw to make the codes clear.
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The lan8814 represents a package of 4 PHYs. All of them are sharing the
same interrupt line. So when a link was going down/up or a frame was
timestamped, then the interrupt handler of all the PHYs was called.
Which is all fine and expected but the problem is the way the handler
interrupt works.
Basically if one of the PHYs timestamp a frame, then all the other 3
PHYs were polling the status of the interrupt until that PHY actually
cleared the interrupt by reading the timestamp.
The reason of polling was in case another PHY was also timestamping a
frame at the same time, it could miss this interrupt. But this is not
the right approach, because it is the interrupt controller who needs to
call the interrupt handlers again if the interrupt line is still
active.
Therefore change this such when the interrupt handler is called check
only if the interrupt is for itself, otherwise just exit. In this way
save CPU usage.
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Link: https://lore.kernel.org/r/20230104194218.3785229-1-horatiu.vultur@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Zero-length arrays are deprecated[1]. Replace struct ethtool_rxnfc's
"rule_locs" 0-length array with a flexible array. Detected with GCC 13,
using -fstrict-flex-arrays=3:
net/ethtool/common.c: In function 'ethtool_get_max_rxnfc_channel':
net/ethtool/common.c:558:55: warning: array subscript i is outside array bounds of '__u32[0]' {aka 'unsigned int[]'} [-Warray-bounds=]
558 | .fs.location = info->rule_locs[i],
| ~~~~~~~~~~~~~~~^~~
In file included from include/linux/ethtool.h:19,
from include/uapi/linux/ethtool_netlink.h:12,
from include/linux/ethtool_netlink.h:6,
from net/ethtool/common.c:3:
include/uapi/linux/ethtool.h:1186:41: note: while referencing
'rule_locs'
1186 | __u32 rule_locs[0];
| ^~~~~~~~~
[1] https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: kernel test robot <lkp@intel.com>
Cc: Oleksij Rempel <linux@rempel-privat.de>
Cc: Sean Anderson <sean.anderson@seco.com>
Cc: Alexandru Tachici <alexandru.tachici@analog.com>
Cc: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20230106042844.give.885-kees@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Zero-length arrays are deprecated[1]. Replace struct ipv6_rpl_sr_hdr's
"segments" union of 0-length arrays with flexible arrays. Detected with
GCC 13, using -fstrict-flex-arrays=3:
In function 'rpl_validate_srh',
inlined from 'rpl_build_state' at ../net/ipv6/rpl_iptunnel.c:96:7:
../net/ipv6/rpl_iptunnel.c:60:28: warning: array subscript <unknown> is outside array bounds of 'struct in6_addr[0]' [-Warray-bounds=]
60 | if (ipv6_addr_type(&srh->rpl_segaddr[srh->segments_left - 1]) &
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from ../include/net/rpl.h:12,
from ../net/ipv6/rpl_iptunnel.c:13:
../include/uapi/linux/rpl.h: In function 'rpl_build_state':
../include/uapi/linux/rpl.h:40:33: note: while referencing 'addr'
40 | struct in6_addr addr[0];
| ^~~~
[1] https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20230105221533.never.711-kees@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Zero-length arrays are deprecated[1]. Replace struct ioam6_trace_hdr's
"data" 0-length array with a flexible array. Detected with GCC 13,
using -fstrict-flex-arrays=3:
net/ipv6/ioam6_iptunnel.c: In function 'ioam6_build_state':
net/ipv6/ioam6_iptunnel.c:194:37: warning: array subscript <unknown> is outside array bounds of '__u8[0]' {aka 'unsigned char[]'} [-Warray-bounds=]
194 | tuninfo->traceh.data[trace->remlen * 4] = IPV6_TLV_PADN;
| ~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~
In file included from include/linux/ioam6.h:11,
from net/ipv6/ioam6_iptunnel.c:13:
include/uapi/linux/ioam6.h:130:17: note: while referencing 'data'
130 | __u8 data[0];
| ^~~~
[1] https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Justin Iurman <justin.iurman@uliege.be>
Tested-by: Justin Iurman <justin.iurman@uliege.be>
Link: https://lore.kernel.org/r/20230105222115.never.661-kees@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Jakub Kicinski says:
====================
devlink: remove the wait-for-references on unregister
Move the registration and unregistration of the devlink instances
under their instance locks. Don't perform the netdev-style wait
for all references when unregistering the instance.
Instead the devlink instance refcount will only ensure that
the memory of the instance is not freed. All places which acquire
access to devlink instances via a reference must check that the
instance is still registered under the instance lock.
This fixes the problem of the netdev code accessing devlink
instances before they are registered.
RFC: https://lore.kernel.org/all/20221217011953.152487-1-kuba@kernel.org/
- rewrite the cover letter
- rewrite the commit message for patch 1
- un-export and rename devl_is_alive
- squash the netdevsim patches
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
To prevent races with netdev code accessing free devlink instances
move the registration under the devlink instance lock.
Core now waits for the instance to be registered before accessing it.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
err_dl_unregister should unregister the devlink instance.
Looks like renaming it was missed in one of the reshufflings.
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
It's most natural to register the instance first and then its
subobjects. Now that we can use the instance lock to protect
the atomicity of all init - it should also be safe.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Requiring devlink_set_features() to be run before devlink is
registered is overzealous. devlink_set_features() itself is
a leftover from old workarounds which were trying to prevent
initiating reload before probe was complete.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The objective of exposing the devlink instance locks to
drivers was to let them use these locks to prevent user space
from accessing the device before it's fully initialized.
This is difficult because devlink_unregister() waits for all
references to be released, meaning that devlink_unregister()
can't itself be called under the instance lock.
To avoid this issue devlink_register() was moved after subobject
registration a while ago. Unfortunately the netdev paths get
a hold of the devlink instances _before_ they are registered.
Ideally netdev should wait for devlink init to finish (synchronizing
on the instance lock). This can't work because we don't know if the
instance will _ever_ be registered (in case of failures it may not).
The other option of returning an error until devlink_register()
is called is unappealing (user space would get a notification
netdev exist but would have to wait arbitrary amount of time
before accessing some of its attributes).
Weaken the guarantees of the devlink references.
Holding a reference will now only guarantee that the memory
of the object is around. Another way of looking at it is that
the reference now protects the object not its "registered" status.
Use devlink instance lock to synchronize unregistration.
This implies that releasing of the "main" reference of the devlink
instance moves from devlink_unregister() to devlink_free().
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Always check under the instance lock whether the devlink instance
is still / already registered.
This is a no-op for the most part, as the unregistration path currently
waits for all references. On the init path, however, we may temporarily
open up a race with netdev code, if netdevs are registered before the
devlink instance. This is temporary, the next change fixes it, and this
commit has been split out for the ease of review.
Note that in case of iterating over sub-objects which have their
own lock (regions and line cards) we assume an implicit dependency
between those objects existing and devlink unregistration.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
devlink->dev is assumed to be always valid as long as any
outstanding reference to the devlink instance exists.
In prep for weakening of the references take the instance lock.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
devlink_pernet_pre_exit() is the only obvious place which takes
the instance lock without using the devl_ helpers. Update the code
and move the error print after releasing the reference
(having unlock and put together feels slightly idiomatic).
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
xa_find_after() is designed to handle multi-index entries correctly.
If a xarray has two entries one which spans indexes 0-3 and one at
index 4 xa_find_after(0) will return the entry at index 4.
Having to juggle the two callbacks, however, is unnecessary in case
of the devlink xarray, as there is 1:1 relationship with indexes.
Always use xa_find() and increment the index manually.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
All were not visible to the non-priv users inside netns. However,
with 4ecb90090c84 ("sysctl: allow override of /proc/sys/net with
CAP_NET_ADMIN"), these vars are protected from getting modified.
A proc with capable(CAP_NET_ADMIN) can change the values so
not having them visible inside netns is just causing nuisance to
process that check certain values (e.g. net.core.somaxconn) and
see different behavior in root-netns vs. other-netns
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Jakub Kicinski says:
====================
devlink: code split and structured instance walk
Split devlink.c into a handful of files, trying to keep the "core"
code away from all the command-specific implementations.
The core code has been quite scattered until now. Going forward we can
consider using a source file per-subobject, I think that it's quite
beneficial to newcomers (based on relative ease with which folks
contribute to ethtool vs devlink). But this series doesn't split
everything out, yet - partially due to backporting concerns,
but mostly due to lack of time. Bulk of the netlink command
handling is left in a leftover.c file.
Introduce a context structure for dumps, and use it to store
the devlink instance ID of the last dumped devlink instance.
This means we don't have to restart the walk from 0 each time.
Finally - introduce a "structured walk". A centralized dump handler
in devlink/netlink.c which walks the devlink instances, deals with
refcounting/locking, simplifying the per-object implementations quite
a bit. Inspired by the ethtool code.
v1: https://lore.kernel.org/all/20230104041636.226398-1-kuba@kernel.org/
RFC: https://lore.kernel.org/all/20221215020155.1619839-1-kuba@kernel.org/
====================
Link: https://lore.kernel.org/r/20230105040531.353563-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Soon we'll have to check if a devlink instance is alive after
locking it. Convert to the by-instance dumping scheme to make
refactoring easier.
Most of the subobject code no longer has to worry about any devlink
locking / lifetime rules (the only ones that still do are the two subject
types which stubbornly use their own locking). Both dump and do callbacks
are given a devlink instance which is already locked and good-to-access
(do from the .pre_doit handler, dump from the new dump indirection).
Note that we'll now check presence of an op (e.g. for sb_pool_get)
under the devlink instance lock, that will soon be necessary anyway,
because we don't hold refs on the driver modules so the memory
in which ops live may be gone for a dead instance, after upcoming
locking changes.
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Most dumpit implementations walk the devlink instances.
This requires careful lock taking and reference dropping.
Factor the loop out and provide just a callback to handle
a single instance dump.
Convert one user as an example, other users converted
in the next change.
Slightly inspired by ethtool netlink code.
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Move the lock taking out of devlink_nl_cmd_region_get_devlink_dumpit().
This way all dumps will take the instance lock in the main iteration
loop directly, making refactoring and reading the code easier.
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use xarray id for cases of sub-objects which are iterated in
a function.
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use xarray id for cases of simple sub-object iteration.
We'll now use the state->instance for the devlink instances
and state->idx for subobject index.
Moving the definition of idx into the inner loop makes sense,
so while at it also move other sub-object local variables into
the loop.
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
xarray gives each devlink instance an id and allows us to restart
walk based on that id quite neatly. This is nice both from the
perspective of code brevity and from the stability of the dump
(devlink instances disappearing from before the resumption point
will not cause inconsistent dumps).
This patch takes care of simple cases where state->idx counts
devlink instances only.
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Walk devlink instances only once. Dump the instance reporters
and port reporters before moving to the next instance.
User space should not depend on ordering of messages.
This will make improving stability of the walk easier.
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Looks like devlinks_xa_find_get() was intended to get the mark
from the @filter argument. It doesn't actually use @filter, passing
DEVLINK_REGISTERED to xa_find_fn() directly. Walking marks other
than registered is unlikely so drop @filter argument completely.
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The start variables made the code clearer when we had to access
cb->args[0] directly, as the name args doesn't explain much.
Now that we use a structure to hold state this seems no longer
needed.
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Create a dump context structure instead of using cb->args
as an unsigned long array. This is a pure conversion which
is intended to be as much of a noop as possible.
Subsequent changes will use this to simplify the code.
The two non-trivial parts are:
- devlink_nl_cmd_health_reporter_dump_get_dumpit() checks args[0]
to see if devlink_fmsg_dumpit() has already been called (whether
this is the first msg), but doesn't use the exact value, so we
can drop the local variable there already
- devlink_nl_cmd_region_read_dumpit() uses args[0] for address
but we'll use args[1] now, shouldn't matter
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|