Age | Commit message (Collapse) | Author | Files | Lines |
|
Lorenzo points out that the generic CLI is broken for the netdev
family. When I added the support for documentation of enums
(and sparse enums) the client script was not updated.
It expects the values in enum to be a list of names,
now it can also be a dict (YAML object).
Reported-by: Lorenzo Bianconi <lorenzo@kernel.org>
Fixes: e4b48ed460d3 ("tools: ynl: add a completely generic client")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Move bulk of the EnumSet and EnumEntry code to shared
code for reuse by cli.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When sock_alloc_file fails to allocate a file, it will call sock_release.
__sys_socket_file should then not call sock_release again, otherwise there
will be a double free.
[ 89.319884] ------------[ cut here ]------------
[ 89.320286] kernel BUG at fs/inode.c:1764!
[ 89.320656] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[ 89.321051] CPU: 7 PID: 125 Comm: iou-sqp-124 Not tainted 6.2.0+ #361
[ 89.321535] RIP: 0010:iput+0x1ff/0x240
[ 89.321808] Code: d1 83 e1 03 48 83 f9 02 75 09 48 81 fa 00 10 00 00 77 05 83 e2 01 75 1f 4c 89 ef e8 fb d2 ba 00 e9 80 fe ff ff c3 cc cc cc cc <0f> 0b 0f 0b e9 d0 fe ff ff 0f 0b eb 8d 49 8d b4 24 08 01 00 00 48
[ 89.322760] RSP: 0018:ffffbdd60068bd50 EFLAGS: 00010202
[ 89.323036] RAX: 0000000000000000 RBX: ffff9d7ad3cacac0 RCX: 0000000000001107
[ 89.323412] RDX: 000000000003af00 RSI: 0000000000000000 RDI: ffff9d7ad3cacb40
[ 89.323785] RBP: ffffbdd60068bd68 R08: ffffffffffffffff R09: ffffffffab606438
[ 89.324157] R10: ffffffffacb3dfa0 R11: 6465686361657256 R12: ffff9d7ad3cacb40
[ 89.324529] R13: 0000000080000001 R14: 0000000080000001 R15: 0000000000000002
[ 89.324904] FS: 00007f7b28516740(0000) GS:ffff9d7aeb1c0000(0000) knlGS:0000000000000000
[ 89.325328] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 89.325629] CR2: 00007f0af52e96c0 CR3: 0000000002a02006 CR4: 0000000000770ee0
[ 89.326004] PKRU: 55555554
[ 89.326161] Call Trace:
[ 89.326298] <TASK>
[ 89.326419] __sock_release+0xb5/0xc0
[ 89.326632] __sys_socket_file+0xb2/0xd0
[ 89.326844] io_socket+0x88/0x100
[ 89.327039] ? io_issue_sqe+0x6a/0x430
[ 89.327258] io_issue_sqe+0x67/0x430
[ 89.327450] io_submit_sqes+0x1fe/0x670
[ 89.327661] io_sq_thread+0x2e6/0x530
[ 89.327859] ? __pfx_autoremove_wake_function+0x10/0x10
[ 89.328145] ? __pfx_io_sq_thread+0x10/0x10
[ 89.328367] ret_from_fork+0x29/0x50
[ 89.328576] RIP: 0033:0x0
[ 89.328732] Code: Unable to access opcode bytes at 0xffffffffffffffd6.
[ 89.329073] RSP: 002b:0000000000000000 EFLAGS: 00000202 ORIG_RAX: 00000000000001a9
[ 89.329477] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007f7b28637a3d
[ 89.329845] RDX: 00007fff4e4318a8 RSI: 00007fff4e4318b0 RDI: 0000000000000400
[ 89.330216] RBP: 00007fff4e431830 R08: 00007fff4e431711 R09: 00007fff4e4318b0
[ 89.330584] R10: 0000000000000000 R11: 0000000000000202 R12: 00007fff4e441b38
[ 89.330950] R13: 0000563835e3e725 R14: 0000563835e40d10 R15: 00007f7b28784040
[ 89.331318] </TASK>
[ 89.331441] Modules linked in:
[ 89.331617] ---[ end trace 0000000000000000 ]---
Fixes: da214a475f8b ("net: add __sys_socket_file()")
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20230307173707.468744-1-cascardo@canonical.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
syzbot reported struct pid leak [1].
Issue is that queue_oob() calls maybe_add_creds() which potentially
holds a reference on a pid.
But skb->destructor is not set (either directly or by calling
unix_scm_to_skb())
This means that subsequent kfree_skb() or consume_skb() would leak
this reference.
In this fix, I chose to fully support scm even for the OOB message.
[1]
BUG: memory leak
unreferenced object 0xffff8881053e7f80 (size 128):
comm "syz-executor242", pid 5066, jiffies 4294946079 (age 13.220s)
hex dump (first 32 bytes):
01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<ffffffff812ae26a>] alloc_pid+0x6a/0x560 kernel/pid.c:180
[<ffffffff812718df>] copy_process+0x169f/0x26c0 kernel/fork.c:2285
[<ffffffff81272b37>] kernel_clone+0xf7/0x610 kernel/fork.c:2684
[<ffffffff812730cc>] __do_sys_clone+0x7c/0xb0 kernel/fork.c:2825
[<ffffffff849ad699>] do_syscall_x64 arch/x86/entry/common.c:50 [inline]
[<ffffffff849ad699>] do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
[<ffffffff84a0008b>] entry_SYSCALL_64_after_hwframe+0x63/0xcd
Fixes: 314001f0bf92 ("af_unix: Add OOB support")
Reported-by: syzbot+7699d9e5635c10253a27@syzkaller.appspotmail.com
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Rao Shoaib <rao.shoaib@oracle.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20230307164530.771896-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This reverts commit d5e2d038dbece821f1af57acbeded3aa9a1832c1.
We have a report of this chip being used on a
SURECOM EP-320X-S 100/10M Ethernet PCI Adapter
which could still have been purchased in some parts
of the world 3 years ago.
Cc: stable@vger.kernel.org
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217151
Fixes: d5e2d038dbec ("eth: fealnx: delete the driver for Myson MTD-800")
Link: https://lore.kernel.org/r/20230307171930.4008454-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The MT7530 switch from the MT7621 SoC has 2 ports which can be set up as
internal: port 5 and 6. Arınç reports that the GMAC1 attached to port 5
receives corrupted frames, unless port 6 (attached to GMAC0) has been
brought up by the driver. This is true regardless of whether port 5 is
used as a user port or as a CPU port (carrying DSA tags).
Offline debugging (blind for me) which began in the linked thread showed
experimentally that the configuration done by the driver for port 6
contains a step which is needed by port 5 as well - the write to
CORE_GSWPLL_GRP2 (note that I've no idea as to what it does, apart from
the comment "Set core clock into 500Mhz"). Prints put by Arınç show that
the reset value of CORE_GSWPLL_GRP2 is RG_GSWPLL_POSDIV_500M(1) |
RG_GSWPLL_FBKDIV_500M(40) (0x128), both on the MCM MT7530 from the
MT7621 SoC, as well as on the standalone MT7530 from MT7623NI Bananapi
BPI-R2. Apparently, port 5 on the standalone MT7530 can work under both
values of the register, while on the MT7621 SoC it cannot.
The call path that triggers the register write is:
mt753x_phylink_mac_config() for port 6
-> mt753x_pad_setup()
-> mt7530_pad_clk_setup()
so this fully explains the behavior noticed by Arınç, that bringing port
6 up is necessary.
The simplest fix for the problem is to extract the register writes which
are needed for both port 5 and 6 into a common mt7530_pll_setup()
function, which is called at mt7530_setup() time, immediately after
switch reset. We can argue that this mirrors the code layout introduced
in mt7531_setup() by commit 42bc4fafe359 ("net: mt7531: only do PLL once
after the reset"), in that the PLL setup has the exact same positioning,
and further work to consolidate the separate setup() functions is not
hindered.
Testing confirms that:
- the slight reordering of writes to MT7530_P6ECR and to
CORE_GSWPLL_GRP1 / CORE_GSWPLL_GRP2 introduced by this change does not
appear to cause problems for the operation of port 6 on MT7621 and on
MT7623 (where port 5 also always worked)
- packets sent through port 5 are not corrupted anymore, regardless of
whether port 6 is enabled by phylink or not (or even present in the
device tree)
My algorithm for determining the Fixes: tag is as follows. Testing shows
that some logic from mt7530_pad_clk_setup() is needed even for port 5.
Prior to commit ca366d6c889b ("net: dsa: mt7530: Convert to PHYLINK
API"), a call did exist for all phy_is_pseudo_fixed_link() ports - so
port 5 included. That commit replaced it with a temporary "Port 5 is not
supported!" comment, and the following commit 38f790a80560 ("net: dsa:
mt7530: Add support for port 5") replaced that comment with a
configuration procedure in mt7530_setup_port5() which was insufficient
for port 5 to work. I'm laying the blame on the patch that claimed
support for port 5, although one would have also needed the change from
commit c3b8e07909db ("net: dsa: mt7530: setup core clock even in TRGMII
mode") for the write to be performed completely independently from port
6's configuration.
Thanks go to Arınç for describing the problem, for debugging and for
testing.
Reported-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Link: https://lore.kernel.org/netdev/f297c2c4-6e7c-57ac-2394-f6025d309b9d@arinc9.com/
Fixes: 38f790a80560 ("net: dsa: mt7530: Add support for port 5")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230307155411.868573-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Fix deletion of existing DSCP mappings in the APP table.
Adding and deleting DSCP entries are replicated per-port, since the
mapping table is global for all ports in the chip. Whenever a mapping
for a DSCP value already exists, the old mapping is deleted first.
However, it is only deleted for the specified port. Fix this by calling
sparx5_dcb_ieee_delapp() instead of dcb_ieee_delapp() as it ought to be.
Reproduce:
// Map and remap DSCP value 63
$ dcb app add dev eth0 dscp-prio 63:1
$ dcb app add dev eth0 dscp-prio 63:2
$ dcb app show dev eth0 dscp-prio
dscp-prio 63:2
$ dcb app show dev eth1 dscp-prio
dscp-prio 63:1 63:2 <-- 63:1 should not be there
Fixes: 8dcf69a64118 ("net: microchip: sparx5: add support for offloading dscp table")
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
detection
NDC caches contexts of frequently used queue's (Rx and Tx queues)
contexts. Due to a HW errata when NDC detects fault/poision while
accessing contexts it could go into an illegal state where a cache
line could get locked forever. To makesure all cache lines in NDC
are available for optimum performance upon fault/lockerror/posion
errors scan through all cache lines in NDC and clear the lock bit.
Fixes: 4a3581cd5995 ("octeontx2-af: NPA AQ instruction enqueue support")
Signed-off-by: Suman Ghosh <sumang@marvell.com>
Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Signed-off-by: Sai Krishna <saikrishnag@marvell.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Before determining whether the msg has unsupported options, it has been
prematurely terminated by the wrong status check.
For the application, the general usages of MSG_FASTOPEN likes
fd = socket(...)
/* rather than connect */
sendto(fd, data, len, MSG_FASTOPEN)
Hence, We need to check the flag before state check, because the sock
state here is always SMC_INIT when applications tries MSG_FASTOPEN.
Once we found unsupported options, fallback it to TCP.
Fixes: ee9dfbef02d1 ("net/smc: handle sockopts forcing fallback")
Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
v2 -> v1: Optimize code style
Reviewed-by: Tony Lu <tonylu@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
I was intending to make all the Netlink Spec code BSD-3-Clause
to ease the adoption but it appears that:
- I fumbled the uAPI and used "GPL WITH uAPI note" there
- it gives people pause as they expect GPL in the kernel
As suggested by Chuck re-license under dual. This gives us benefit
of full BSD freedom while fulfilling the broad "kernel is under GPL"
expectations.
Link: https://lore.kernel.org/all/20230304120108.05dd44c5@kernel.org/
Link: https://lore.kernel.org/r/20230306200457.3903854-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Map all my old email addresses to current address.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Link: https://lore.kernel.org/r/20230306194405.108236-1-stephen@networkplumber.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Map Maxim's old corporate addresses to his personal one.
Link: https://lore.kernel.org/r/20230306192018.3894988-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
cb_context should be freed on the error path in nfc_se_io as stated by
commit 25ff6f8a5a3b ("nfc: fix memory leak of se_io context in
nfc_genl_se_io").
Make the error path in nfc_se_io unwind everything in reverse order, i.e.
free the cb_context after unlocking the device.
Suggested-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20230306212650.230322-1-pchelkin@ispras.ru
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add the following Telit FE990 composition:
0x1080: tty, adb, rmnet, tty, tty, tty, tty
Signed-off-by: Enrico Sau <enrico.sau@gmail.com>
Link: https://lore.kernel.org/r/20230306120528.198842-1-enrico.sau@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Add quirk CDC_MBIM_FLAG_AVOID_ALTSETTING_TOGGLE for Telit FE990
0x1081 composition in order to avoid bind error.
Signed-off-by: Enrico Sau <enrico.sau@gmail.com>
Link: https://lore.kernel.org/r/20230306115933.198259-1-enrico.sau@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net:
1) Restore ctnetlink zero mark in events and dump, from Ivan Delalande.
2) Fix deadlock due to missing disabled bh in tproxy, from Florian Westphal.
3) Safer maximum chain load in conntrack, from Eric Dumazet.
* 'main' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: conntrack: adopt safer max chain length
netfilter: tproxy: fix deadlock due to missing BH disable
netfilter: ctnetlink: revert to dumping mark regardless of event type
====================
Link: https://lore.kernel.org/r/20230307100424.2037-1-pablo@netfilter.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Customers using GKE 1.25 and 1.26 are facing conntrack issues
root caused to commit c9c3b6811f74 ("netfilter: conntrack: make
max chain length random").
Even if we assume Uniform Hashing, a bucket often reachs 8 chained
items while the load factor of the hash table is smaller than 0.5
With a limit of 16, we reach load factors of 3.
With a limit of 32, we reach load factors of 11.
With a limit of 40, we reach load factors of 15.
With a limit of 50, we reach load factors of 24.
This patch changes MIN_CHAINLEN to 50, to minimize risks.
Ideally, we could in the future add a cushion based on expected
load factor (2 * nf_conntrack_max / nf_conntrack_buckets),
because some setups might expect unusual values.
Fixes: c9c3b6811f74 ("netfilter: conntrack: make max chain length random")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:
====================
pull-request: bpf 2023-03-06
We've added 8 non-merge commits during the last 7 day(s) which contain
a total of 9 files changed, 64 insertions(+), 18 deletions(-).
The main changes are:
1) Fix BTF resolver for DATASEC sections when a VAR points at a modifier,
that is, keep resolving such instances instead of bailing out,
from Lorenz Bauer.
2) Fix BPF test framework with regards to xdp_frame info misplacement
in the "live packet" code, from Alexander Lobakin.
3) Fix an infinite loop in BPF sockmap code for TCP/UDP/AF_UNIX,
from Liu Jian.
4) Fix a build error for riscv BPF JIT under PERF_EVENTS=n,
from Randy Dunlap.
5) Several BPF doc fixes with either broken links or external instead
of internal doc links, from Bagas Sanjaya.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
selftests/bpf: check that modifier resolves after pointer
btf: fix resolving BTF_KIND_VAR after ARRAY, STRUCT, UNION, PTR
bpf, test_run: fix &xdp_frame misplacement for LIVE_FRAMES
bpf, doc: Link to submitting-patches.rst for general patch submission info
bpf, doc: Do not link to docs.kernel.org for kselftest link
bpf, sockmap: Fix an infinite loop error when len is 0 in tcp_bpf_recvmsg_parser()
riscv, bpf: Fix patch_text implicit declaration
bpf, docs: Fix link to BTF doc
====================
Link: https://lore.kernel.org/r/20230306215944.11981-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Adrien reports that incorrect data is transmitted when a single
page straddles multiple records. We would transmit the same
data in all iterations of the loop.
Reported-by: Adrien Moulin <amoulin@corp.free.fr>
Link: https://lore.kernel.org/all/61481278.42813558.1677845235112.JavaMail.zimbra@corp.free.fr
Fixes: c1318b39c7d3 ("tls: Add opt-in zerocopy mode of sendfile()")
Tested-by: Adrien Moulin <amoulin@corp.free.fr>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Acked-by: Maxim Mikityanskiy <maxtram95@gmail.com>
Link: https://lore.kernel.org/r/20230304192610.3818098-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Fix data corruption issue with SerDes connected PHYs operating at 1.25
Gbps speed where we could previously observe about 30% packet loss while
the bad packet counter was increasing.
As almost all boards with MediaTek MT7622 or MT7986 use either the MT7531
switch IC operating at 3.125Gbps SerDes rate or single-port PHYs using
rate-adaptation to 2500Base-X mode, this issue only got exposed now when
we started trying to use SFP modules operating with 1.25 Gbps with the
BananaPi R3 board.
The fix is to set bit 12 which disables the RX FIFO clear function when
setting up MAC MCR, MediaTek SDK did the same change stating:
"If without this patch, kernel might receive invalid packets that are
corrupted by GMAC."[1]
[1]: https://git01.mediatek.com/plugins/gitiles/openwrt/feeds/mtk-openwrt-feeds/+/d8a2975939a12686c4a95c40db21efdc3f821f63
Fixes: 42c03844e93d ("net-next: mediatek: add support for MediaTek MT7622 SoC")
Tested-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/138da2735f92c8b6f8578ec2e5a794ee515b665f.1677937317.git.daniel@makrotopia.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently link up can't be detected in forced mode if polling
isn't used. Only link up interrupt source we have is aneg
complete which isn't applicable in forced mode. Therefore we
have to use energy-on as link up indicator.
Fixes: 7365494550f6 ("net: phy: smsc: skip ENERGYON interrupt if disabled")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Lorenz Bauer says:
====================
See the first patch for a detailed explanation.
v2:
- Move RESOLVE_TBD assignment out of the loop (Martin)
====================
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
Add a regression test that ensures that a VAR pointing at a
modifier which follows a PTR (or STRUCT or ARRAY) is resolved
correctly by the datasec validator.
Signed-off-by: Lorenz Bauer <lmb@isovalent.com>
Link: https://lore.kernel.org/r/20230306112138.155352-3-lmb@isovalent.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
btf_datasec_resolve contains a bug that causes the following BTF
to fail loading:
[1] DATASEC a size=2 vlen=2
type_id=4 offset=0 size=1
type_id=7 offset=1 size=1
[2] INT (anon) size=1 bits_offset=0 nr_bits=8 encoding=(none)
[3] PTR (anon) type_id=2
[4] VAR a type_id=3 linkage=0
[5] INT (anon) size=1 bits_offset=0 nr_bits=8 encoding=(none)
[6] TYPEDEF td type_id=5
[7] VAR b type_id=6 linkage=0
This error message is printed during btf_check_all_types:
[1] DATASEC a size=2 vlen=2
type_id=7 offset=1 size=1 Invalid type
By tracing btf_*_resolve we can pinpoint the problem:
btf_datasec_resolve(depth: 1, type_id: 1, mode: RESOLVE_TBD) = 0
btf_var_resolve(depth: 2, type_id: 4, mode: RESOLVE_TBD) = 0
btf_ptr_resolve(depth: 3, type_id: 3, mode: RESOLVE_PTR) = 0
btf_var_resolve(depth: 2, type_id: 4, mode: RESOLVE_PTR) = 0
btf_datasec_resolve(depth: 1, type_id: 1, mode: RESOLVE_PTR) = -22
The last invocation of btf_datasec_resolve should invoke btf_var_resolve
by means of env_stack_push, instead it returns EINVAL. The reason is that
env_stack_push is never executed for the second VAR.
if (!env_type_is_resolve_sink(env, var_type) &&
!env_type_is_resolved(env, var_type_id)) {
env_stack_set_next_member(env, i + 1);
return env_stack_push(env, var_type, var_type_id);
}
env_type_is_resolve_sink() changes its behaviour based on resolve_mode.
For RESOLVE_PTR, we can simplify the if condition to the following:
(btf_type_is_modifier() || btf_type_is_ptr) && !env_type_is_resolved()
Since we're dealing with a VAR the clause evaluates to false. This is
not sufficient to trigger the bug however. The log output and EINVAL
are only generated if btf_type_id_size() fails.
if (!btf_type_id_size(btf, &type_id, &type_size)) {
btf_verifier_log_vsi(env, v->t, vsi, "Invalid type");
return -EINVAL;
}
Most types are sized, so for example a VAR referring to an INT is not a
problem. The bug is only triggered if a VAR points at a modifier. Since
we skipped btf_var_resolve that modifier was also never resolved, which
means that btf_resolved_type_id returns 0 aka VOID for the modifier.
This in turn causes btf_type_id_size to return NULL, triggering EINVAL.
To summarise, the following conditions are necessary:
- VAR pointing at PTR, STRUCT, UNION or ARRAY
- Followed by a VAR pointing at TYPEDEF, VOLATILE, CONST, RESTRICT or
TYPE_TAG
The fix is to reset resolve_mode to RESOLVE_TBD before attempting to
resolve a VAR from a DATASEC.
Fixes: 1dc92851849c ("bpf: kernel side support for BTF Var and DataSec")
Signed-off-by: Lorenz Bauer <lmb@isovalent.com>
Link: https://lore.kernel.org/r/20230306112138.155352-2-lmb@isovalent.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
&xdp_buff and &xdp_frame are bound in a way that
xdp_buff->data_hard_start == xdp_frame
It's always the case and e.g. xdp_convert_buff_to_frame() relies on
this.
IOW, the following:
for (u32 i = 0; i < 0xdead; i++) {
xdpf = xdp_convert_buff_to_frame(&xdp);
xdp_convert_frame_to_buff(xdpf, &xdp);
}
shouldn't ever modify @xdpf's contents or the pointer itself.
However, "live packet" code wrongly treats &xdp_frame as part of its
context placed *before* the data_hard_start. With such flow,
data_hard_start is sizeof(*xdpf) off to the right and no longer points
to the XDP frame.
Instead of replacing `sizeof(ctx)` with `offsetof(ctx, xdpf)` in several
places and praying that there are no more miscalcs left somewhere in the
code, unionize ::frm with ::data in a flex array, so that both starts
pointing to the actual data_hard_start and the XDP frame actually starts
being a part of it, i.e. a part of the headroom, not the context.
A nice side effect is that the maximum frame size for this mode gets
increased by 40 bytes, as xdp_buff::frame_sz includes everything from
data_hard_start (-> includes xdpf already) to the end of XDP/skb shared
info.
Also update %MAX_PKT_SIZE accordingly in the selftests code. Leave it
hardcoded for 64 bit && 4k pages, it can be made more flexible later on.
Minor: align `&head->data` with how `head->frm` is assigned for
consistency.
Minor #2: rename 'frm' to 'frame' in &xdp_page_head while at it for
clarity.
(was found while testing XDP traffic generator on ice, which calls
xdp_convert_frame_to_buff() for each XDP frame)
Fixes: b530e9e1063e ("bpf: Add "live packet" mode for XDP in BPF_PROG_RUN")
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://lore.kernel.org/r/20230224163607.2994755-1-aleksander.lobakin@intel.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
The link for patch submission information in general refers to index
page for "Working with the kernel development community" section of
kernel docs, whereas the link should have been
Documentation/process/submitting-patches.rst instead.
Fix it by replacing the index target with the appropriate doc.
Fixes: 542228384888f5 ("bpf, doc: convert bpf_devel_QA.rst to use RST formatting")
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20230228074523.11493-3-bagasdotme@gmail.com
|
|
The question on how to run BPF selftests have a reference link to kernel
selftest documentation (Documentation/dev-tools/kselftest.rst). However,
it uses external link to the documentation at kernel.org/docs (aka
docs.kernel.org) instead, which requires Internet access.
Fix this and replace the link with internal linking, by using :doc: directive
while keeping the anchor text.
Fixes: b7a27c3aafa252 ("bpf, doc: howto use/run the BPF selftests")
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20230228074523.11493-2-bagasdotme@gmail.com
|
|
The xtables packet traverser performs an unconditional local_bh_disable(),
but the nf_tables evaluation loop does not.
Functions that are called from either xtables or nftables must assume
that they can be called in process context.
inet_twsk_deschedule_put() assumes that no softirq interrupt can occur.
If tproxy is used from nf_tables its possible that we'll deadlock
trying to aquire a lock already held in process context.
Add a small helper that takes care of this and use it.
Link: https://lore.kernel.org/netfilter-devel/401bd6ed-314a-a196-1cdc-e13c720cc8f2@balasys.hu/
Fixes: 4ed8eb6570a4 ("netfilter: nf_tables: Add native tproxy support")
Reported-and-tested-by: Major Dávid <major.david@balasys.hu>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
It seems that change was unintentional, we have userspace code that
needs the mark while listening for events like REPLY, DESTROY, etc.
Also include 0-marks in requested dumps, as they were before that fix.
Fixes: 1feeae071507 ("netfilter: ctnetlink: fix compilation warning after data race fixes in ct mark")
Signed-off-by: Ivan Delalande <colona@arista.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Following warning reported by KASAN during driver unload
==================================================================
BUG: KASAN: double-free in bnxt_remove_one+0x103/0x200 [bnxt_en]
Free of addr ffff88814e8dd4c0 by task rmmod/17469
CPU: 47 PID: 17469 Comm: rmmod Kdump: loaded Tainted: G S 6.2.0-rc7+ #2
Hardware name: Dell Inc. PowerEdge R740/01YM03, BIOS 2.3.10 08/15/2019
Call Trace:
<TASK>
dump_stack_lvl+0x33/0x46
print_report+0x17b/0x4b3
? __call_rcu_common.constprop.79+0x27e/0x8c0
? __pfx_free_object_rcu+0x10/0x10
? __virt_addr_valid+0xe3/0x160
? bnxt_remove_one+0x103/0x200 [bnxt_en]
kasan_report_invalid_free+0x64/0xd0
? bnxt_remove_one+0x103/0x200 [bnxt_en]
? bnxt_remove_one+0x103/0x200 [bnxt_en]
__kasan_slab_free+0x179/0x1c0
? bnxt_remove_one+0x103/0x200 [bnxt_en]
__kmem_cache_free+0x194/0x350
bnxt_remove_one+0x103/0x200 [bnxt_en]
pci_device_remove+0x62/0x110
device_release_driver_internal+0xf6/0x1c0
driver_detach+0x76/0xe0
bus_remove_driver+0x89/0x160
pci_unregister_driver+0x26/0x110
? strncpy_from_user+0x188/0x1c0
bnxt_exit+0xc/0x24 [bnxt_en]
__x64_sys_delete_module+0x21f/0x390
? __pfx___x64_sys_delete_module+0x10/0x10
? __pfx_mem_cgroup_handle_over_high+0x10/0x10
? _raw_spin_lock+0x87/0xe0
? __pfx__raw_spin_lock+0x10/0x10
? __audit_syscall_entry+0x185/0x210
? ktime_get_coarse_real_ts64+0x51/0x80
? syscall_trace_enter.isra.18+0x126/0x1a0
do_syscall_64+0x37/0x90
entry_SYSCALL_64_after_hwframe+0x72/0xdc
RIP: 0033:0x7effcb6fd71b
Code: 73 01 c3 48 8b 0d 6d 17 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 3d 17 2c 00 f7 d8 64 89 01 48
RSP: 002b:00007ffeada270b8 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
RAX: ffffffffffffffda RBX: 00005623660e0750 RCX: 00007effcb6fd71b
RDX: 000000000000000a RSI: 0000000000000800 RDI: 00005623660e07b8
RBP: 0000000000000000 R08: 00007ffeada26031 R09: 0000000000000000
R10: 00007effcb771280 R11: 0000000000000206 R12: 00007ffeada272e0
R13: 00007ffeada28bc4 R14: 00005623660e02a0 R15: 00005623660e0750
</TASK>
Auxiliary device structures are freed in bnxt_aux_dev_release. So avoid
calling kfree from bnxt_remove_one.
Also, set bp->edev to NULL before freeing the auxilary private structure.
Fixes: d80d88b0dfff ("bnxt_en: Add auxiliary driver support")
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The driver needs to keep track of all the possible concurrent TPA (GRO/LRO)
completions on the aggregation ring. On P5 chips, the maximum number
of concurrent TPA is 256 and the amount of memory we allocate is order-5
on systems using 4K pages. Memory allocation failure has been reported:
NetworkManager: page allocation failure: order:5, mode:0x40dc0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO), nodemask=(null),cpuset=/,mems_allowed=0-1
CPU: 15 PID: 2995 Comm: NetworkManager Kdump: loaded Not tainted 5.10.156 #1
Hardware name: Dell Inc. PowerEdge R660/0M1CC5, BIOS 0.2.25 08/12/2022
Call Trace:
dump_stack+0x57/0x6e
warn_alloc.cold.120+0x7b/0xdd
? _cond_resched+0x15/0x30
? __alloc_pages_direct_compact+0x15f/0x170
__alloc_pages_slowpath.constprop.108+0xc58/0xc70
__alloc_pages_nodemask+0x2d0/0x300
kmalloc_order+0x24/0xe0
kmalloc_order_trace+0x19/0x80
bnxt_alloc_mem+0x1150/0x15c0 [bnxt_en]
? bnxt_get_func_stat_ctxs+0x13/0x60 [bnxt_en]
__bnxt_open_nic+0x12e/0x780 [bnxt_en]
bnxt_open+0x10b/0x240 [bnxt_en]
__dev_open+0xe9/0x180
__dev_change_flags+0x1af/0x220
dev_change_flags+0x21/0x60
do_setlink+0x35c/0x1100
Instead of allocating this big chunk of memory and dividing it up for the
concurrent TPA instances, allocate each small chunk separately for each
TPA instance. This will reduce it to order-0 allocations.
Fixes: 79632e9ba386 ("bnxt_en: Expand bnxt_tpa_info struct to support 57500 chips.")
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The locking in phy_probe() and phy_remove() does very little to prevent
any races with e.g. phy_attach_direct(), but instead causes lockdep ABBA
warnings. Remove it.
======================================================
WARNING: possible circular locking dependency detected
6.2.0-dirty #1108 Tainted: G W E
------------------------------------------------------
ip/415 is trying to acquire lock:
ffff5c268f81ef50 (&dev->lock){+.+.}-{3:3}, at: phy_attach_direct+0x17c/0x3a0 [libphy]
but task is already holding lock:
ffffaef6496cb518 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x154/0x560
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (rtnl_mutex){+.+.}-{3:3}:
__lock_acquire+0x35c/0x6c0
lock_acquire.part.0+0xcc/0x220
lock_acquire+0x68/0x84
__mutex_lock+0x8c/0x414
mutex_lock_nested+0x34/0x40
rtnl_lock+0x24/0x30
sfp_bus_add_upstream+0x34/0x150
phy_sfp_probe+0x4c/0x94 [libphy]
mv3310_probe+0x148/0x184 [marvell10g]
phy_probe+0x8c/0x200 [libphy]
call_driver_probe+0xbc/0x15c
really_probe+0xc0/0x320
__driver_probe_device+0x84/0x120
driver_probe_device+0x44/0x120
__device_attach_driver+0xc4/0x160
bus_for_each_drv+0x80/0xe0
__device_attach+0xb0/0x1f0
device_initial_probe+0x1c/0x2c
bus_probe_device+0xa4/0xb0
device_add+0x360/0x53c
phy_device_register+0x60/0xa4 [libphy]
fwnode_mdiobus_phy_device_register+0xc0/0x190 [fwnode_mdio]
fwnode_mdiobus_register_phy+0x160/0xd80 [fwnode_mdio]
of_mdiobus_register+0x140/0x340 [of_mdio]
orion_mdio_probe+0x298/0x3c0 [mvmdio]
platform_probe+0x70/0xe0
call_driver_probe+0x34/0x15c
really_probe+0xc0/0x320
__driver_probe_device+0x84/0x120
driver_probe_device+0x44/0x120
__driver_attach+0x104/0x210
bus_for_each_dev+0x78/0xdc
driver_attach+0x2c/0x3c
bus_add_driver+0x184/0x240
driver_register+0x80/0x13c
__platform_driver_register+0x30/0x3c
xt_compat_calc_jump+0x28/0xa4 [x_tables]
do_one_initcall+0x50/0x1b0
do_init_module+0x50/0x1fc
load_module+0x684/0x744
__do_sys_finit_module+0xc4/0x140
__arm64_sys_finit_module+0x28/0x34
invoke_syscall+0x50/0x120
el0_svc_common.constprop.0+0x6c/0x1b0
do_el0_svc+0x34/0x44
el0_svc+0x48/0xf0
el0t_64_sync_handler+0xb8/0xc0
el0t_64_sync+0x1a0/0x1a4
-> #0 (&dev->lock){+.+.}-{3:3}:
check_prev_add+0xb4/0xc80
validate_chain+0x414/0x47c
__lock_acquire+0x35c/0x6c0
lock_acquire.part.0+0xcc/0x220
lock_acquire+0x68/0x84
__mutex_lock+0x8c/0x414
mutex_lock_nested+0x34/0x40
phy_attach_direct+0x17c/0x3a0 [libphy]
phylink_fwnode_phy_connect.part.0+0x70/0xe4 [phylink]
phylink_fwnode_phy_connect+0x48/0x60 [phylink]
mvpp2_open+0xec/0x2e0 [mvpp2]
__dev_open+0x104/0x214
__dev_change_flags+0x1d4/0x254
dev_change_flags+0x2c/0x7c
do_setlink+0x254/0xa50
__rtnl_newlink+0x430/0x514
rtnl_newlink+0x58/0x8c
rtnetlink_rcv_msg+0x17c/0x560
netlink_rcv_skb+0x64/0x150
rtnetlink_rcv+0x20/0x30
netlink_unicast+0x1d4/0x2b4
netlink_sendmsg+0x1a4/0x400
____sys_sendmsg+0x228/0x290
___sys_sendmsg+0x88/0xec
__sys_sendmsg+0x70/0xd0
__arm64_sys_sendmsg+0x2c/0x40
invoke_syscall+0x50/0x120
el0_svc_common.constprop.0+0x6c/0x1b0
do_el0_svc+0x34/0x44
el0_svc+0x48/0xf0
el0t_64_sync_handler+0xb8/0xc0
el0t_64_sync+0x1a0/0x1a4
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(rtnl_mutex);
lock(&dev->lock);
lock(rtnl_mutex);
lock(&dev->lock);
*** DEADLOCK ***
Fixes: 298e54fa810e ("net: phy: add core phylib sfp support")
Reported-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When MAC is not support PMT, driver will check PHY's WoL capability
and set device wakeup capability in stmmac_init_phy(). We can enable
the WoL through ethtool, the driver would enable the device wake up
flag. Now the device_may_wakeup() return true.
But if there is a way which enable the PHY's WoL capability derectly,
like in BIOS. The driver would not know the enable thing and would not
set the device wake up flag. The phy_suspend may failed like this:
[ 32.409063] PM: dpm_run_callback(): mdio_bus_phy_suspend+0x0/0x50 returns -16
[ 32.409065] PM: Device stmmac-1:00 failed to suspend: error -16
[ 32.409067] PM: Some devices failed to suspend, or early wake event detected
Add to set the device wakeup enable flag according to the get_wol
function result in PHY can fix the error in this scene.
v2: add a Fixes tag.
Fixes: 1d8e5b0f3f2c ("net: stmmac: Support WOL with phy")
Signed-off-by: Rongguang Wei <weirongguang@kylinos.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
tcp_bpf_recvmsg_parser()
When the buffer length of the recvmsg system call is 0, we got the
flollowing soft lockup problem:
watchdog: BUG: soft lockup - CPU#3 stuck for 27s! [a.out:6149]
CPU: 3 PID: 6149 Comm: a.out Kdump: loaded Not tainted 6.2.0+ #30
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
RIP: 0010:remove_wait_queue+0xb/0xc0
Code: 5e 41 5f c3 cc cc cc cc 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 41 57 <41> 56 41 55 41 54 55 48 89 fd 53 48 89 f3 4c 8d 6b 18 4c 8d 73 20
RSP: 0018:ffff88811b5978b8 EFLAGS: 00000246
RAX: 0000000000000000 RBX: ffff88811a7d3780 RCX: ffffffffb7a4d768
RDX: dffffc0000000000 RSI: ffff88811b597908 RDI: ffff888115408040
RBP: 1ffff110236b2f1b R08: 0000000000000000 R09: ffff88811a7d37e7
R10: ffffed10234fa6fc R11: 0000000000000001 R12: ffff88811179b800
R13: 0000000000000001 R14: ffff88811a7d38a8 R15: ffff88811a7d37e0
FS: 00007f6fb5398740(0000) GS:ffff888237180000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000000 CR3: 000000010b6ba002 CR4: 0000000000370ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
tcp_msg_wait_data+0x279/0x2f0
tcp_bpf_recvmsg_parser+0x3c6/0x490
inet_recvmsg+0x280/0x290
sock_recvmsg+0xfc/0x120
____sys_recvmsg+0x160/0x3d0
___sys_recvmsg+0xf0/0x180
__sys_recvmsg+0xea/0x1a0
do_syscall_64+0x3f/0x90
entry_SYSCALL_64_after_hwframe+0x72/0xdc
The logic in tcp_bpf_recvmsg_parser is as follows:
msg_bytes_ready:
copied = sk_msg_recvmsg(sk, psock, msg, len, flags);
if (!copied) {
wait data;
goto msg_bytes_ready;
}
In this case, "copied" always is 0, the infinite loop occurs.
According to the Linux system call man page, 0 should be returned in this
case. Therefore, in tcp_bpf_recvmsg_parser(), if the length is 0, directly
return. Also modify several other functions with the same problem.
Fixes: 1f5be6b3b063 ("udp: Implement udp_bpf_recvmsg() for sockmap")
Fixes: 9825d866ce0d ("af_unix: Implement unix_dgram_bpf_recvmsg()")
Fixes: c5d2177a72a1 ("bpf, sockmap: Fix race in ingress receive verdict with redirect to self")
Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: Liu Jian <liujian56@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Cc: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20230303080946.1146638-1-liujian56@huawei.com
|
|
Simon Horman says:
====================
nfp: fix incorrect IPsec checksum handling
this short series resolves two problems with IPsec checksum handling
in the nfp driver.
* PATCH 1/3, 2/3: Correct setting of checksum flags.
One patch for each of the nfd3 and nfdk datapaths.
* Patch 3/3: Correct configuration of NETIF_F_CSUM_MASK
so that the stack does not unecessarily calculate csums for
IPsec offload packets.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When esp-tx-csum-offload is set to on, the protocol stack shouldn't
calculate the IPsec offload packet's csum, but it does. Because the
callback `.ndo_features_check` incorrectly masked NETIF_F_CSUM_MASK bit.
Fixes: 57f273adbcd4 ("nfp: add framework to support ipsec offloading")
Signed-off-by: Huanhuan Wang <huanhuan.wang@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The csum flag of IPsec packet are set repeatedly. Therefore, the csum
flag set of IPsec and non-IPsec packet need to be distinguished.
As the ipv6 header does not have a csum field, so l3-csum flag is not
required to be set for ipv6 case.
Fixes: 436396f26d50 ("nfp: support IPsec offloading for NFP3800")
Signed-off-by: Huanhuan Wang <huanhuan.wang@corigine.com>
Reviewed-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The csum flag of IPsec packet are set repeatedly. Therefore, the csum
flag set of IPsec and non-IPsec packet need to be distinguished.
As the ipv6 header does not have a csum field, so l3-csum flag is not
required to be set for ipv6 case.
L4-csum flag include the tcp csum flag and udp csum flag, we shouldn't
set the udp and tcp csum flag at the same time for one packet, should
set l4-csum flag according to the transport layer is tcp or udp.
Fixes: 57f273adbcd4 ("nfp: add framework to support ipsec offloading")
Signed-off-by: Huanhuan Wang <huanhuan.wang@corigine.com>
Reviewed-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
ice_get_module_eeprom() is broken since commit e9c9692c8a81 ("ice:
Reimplement module reads used by ethtool") In this refactor,
ice_get_module_eeprom() reads the eeprom in blocks of size 8.
But the condition that should protect the buffer overflow
ignores the last block. The last block always contains zeros.
Bug uncovered by ethtool upstream commit 9538f384b535
("netlink: eeprom: Defer page requests to individual parsers")
After this commit, ethtool reads a block with length = 1;
to read the SFF-8024 identifier value.
unpatched driver:
$ ethtool -m enp65s0f0np0 offset 0x90 length 8
Offset Values
------ ------
0x0090: 00 00 00 00 00 00 00 00
$ ethtool -m enp65s0f0np0 offset 0x90 length 12
Offset Values
------ ------
0x0090: 00 00 01 a0 4d 65 6c 6c 00 00 00 00
$
$ ethtool -m enp65s0f0np0
Offset Values
------ ------
0x0000: 11 06 06 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0060: 00 00 00 00 00 00 00 00 00 00 00 00 00 01 08 00
0x0070: 00 10 00 00 00 00 00 00 00 00 00 00 00 00 00 00
patched driver:
$ ethtool -m enp65s0f0np0 offset 0x90 length 8
Offset Values
------ ------
0x0090: 00 00 01 a0 4d 65 6c 6c
$ ethtool -m enp65s0f0np0 offset 0x90 length 12
Offset Values
------ ------
0x0090: 00 00 01 a0 4d 65 6c 6c 61 6e 6f 78
$ ethtool -m enp65s0f0np0
Identifier : 0x11 (QSFP28)
Extended identifier : 0x00
Extended identifier description : 1.5W max. Power consumption
Extended identifier description : No CDR in TX, No CDR in RX
Extended identifier description : High Power Class (> 3.5 W) not enabled
Connector : 0x23 (No separable connector)
Transceiver codes : 0x88 0x00 0x00 0x00 0x00 0x00 0x00 0x00
Transceiver type : 40G Ethernet: 40G Base-CR4
Transceiver type : 25G Ethernet: 25G Base-CR CA-N
Encoding : 0x05 (64B/66B)
BR, Nominal : 25500Mbps
Rate identifier : 0x00
Length (SMF,km) : 0km
Length (OM3 50um) : 0m
Length (OM2 50um) : 0m
Length (OM1 62.5um) : 0m
Length (Copper or Active cable) : 1m
Transmitter technology : 0xa0 (Copper cable unequalized)
Attenuation at 2.5GHz : 4db
Attenuation at 5.0GHz : 5db
Attenuation at 7.0GHz : 7db
Attenuation at 12.9GHz : 10db
........
....
Fixes: e9c9692c8a81 ("ice: Reimplement module reads used by ethtool")
Signed-off-by: Petr Oros <poros@redhat.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Jakub Kicinski says:
====================
tools: ynl: fix subset use and change default value for attrs/ops
Fix a problem in subsetting, which will become apparent when
the devlink family comes after the merge window. Even tho none
of the existing families need this, we don't want someone to
get "inspired" by the current, incorrect code when using specs
in other languages.
Change the default value for the first attr/op. This is a slight
behavior change so needs to go in now. The diffstat of the last
patch should serve as the clearest justification there..
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Now that the codegen rules had been changed we can update
the specs to reflect the new default.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pretty much all families use value: 1 or reserve as unspec
the first entry in attribute set and the first operation.
Make this the default. Update documentation (the doc for
values of operations just refers back to doc for attrs
so updating only attrs).
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
To avoid having to repeat the entire definition of an attribute
(including the value) use the Attr object from the original set.
In fact this is already the documented expectation.
Fixes: be5bea1cc0bf ("net: add basic C code generators for Netlink")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wpan/wpan
Stefan Schmidt says:
====================
ieee802154 for net 2023-03-02
Two small fixes this time.
Alexander Aring fixed a potential negative array access in the ca8210
driver.
Miquel Raynal fixed a crash that could have been triggered through
the extended netlink API for 802154. This only came in this merge window.
Found by syzkaller.
* tag 'ieee802154-for-net-2023-03-02' of git://git.kernel.org/pub/scm/linux/kernel/git/wpan/wpan:
ieee802154: Prevent user from crashing the host
ca8210: fix mac_len negative array access
====================
Link: https://lore.kernel.org/r/20230302153032.1312755-1-stefan@datenfreihafen.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
syzbot reported use-after-free in cfusbl_device_notify() [1]. This
causes a stack trace like below:
BUG: KASAN: use-after-free in cfusbl_device_notify+0x7c9/0x870 net/caif/caif_usb.c:138
Read of size 8 at addr ffff88807ac4e6f0 by task kworker/u4:6/1214
CPU: 0 PID: 1214 Comm: kworker/u4:6 Not tainted 5.19.0-rc3-syzkaller-00146-g92f20ff72066 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: netns cleanup_net
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
print_address_description.constprop.0.cold+0xeb/0x467 mm/kasan/report.c:313
print_report mm/kasan/report.c:429 [inline]
kasan_report.cold+0xf4/0x1c6 mm/kasan/report.c:491
cfusbl_device_notify+0x7c9/0x870 net/caif/caif_usb.c:138
notifier_call_chain+0xb5/0x200 kernel/notifier.c:87
call_netdevice_notifiers_info+0xb5/0x130 net/core/dev.c:1945
call_netdevice_notifiers_extack net/core/dev.c:1983 [inline]
call_netdevice_notifiers net/core/dev.c:1997 [inline]
netdev_wait_allrefs_any net/core/dev.c:10227 [inline]
netdev_run_todo+0xbc0/0x10f0 net/core/dev.c:10341
default_device_exit_batch+0x44e/0x590 net/core/dev.c:11334
ops_exit_list+0x125/0x170 net/core/net_namespace.c:167
cleanup_net+0x4ea/0xb00 net/core/net_namespace.c:594
process_one_work+0x996/0x1610 kernel/workqueue.c:2289
worker_thread+0x665/0x1080 kernel/workqueue.c:2436
kthread+0x2e9/0x3a0 kernel/kthread.c:376
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:302
</TASK>
When unregistering a net device, unregister_netdevice_many_notify()
sets the device's reg_state to NETREG_UNREGISTERING, calls notifiers
with NETDEV_UNREGISTER, and adds the device to the todo list.
Later on, devices in the todo list are processed by netdev_run_todo().
netdev_run_todo() waits devices' reference count become 1 while
rebdoadcasting NETDEV_UNREGISTER notification.
When cfusbl_device_notify() is called with NETDEV_UNREGISTER multiple
times, the parent device might be freed. This could cause UAF.
Processing NETDEV_UNREGISTER multiple times also causes inbalance of
reference count for the module.
This patch fixes the issue by accepting only first NETDEV_UNREGISTER
notification.
Fixes: 7ad65bf68d70 ("caif: Add support for CAIF over CDC NCM USB interface")
CC: sjur.brandeland@stericsson.com <sjur.brandeland@stericsson.com>
Reported-by: syzbot+b563d33852b893653a9e@syzkaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?id=c3bfd8e2450adab3bffe4d80821fbbced600407f [1]
Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
Link: https://lore.kernel.org/r/20230301163913.391304-1-syoshida@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
from the MAC driver
Move the LAN7800 internal phy (phy ID 0x0007c132) specific register
accesses to the phy driver (microchip.c).
Fix the error reported by Enguerrand de Ribaucourt in December 2022,
"Some operations during the cable switch workaround modify the register
LAN88XX_INT_MASK of the PHY. However, this register is specific to the
LAN8835 PHY. For instance, if a DP8322I PHY is connected to the LAN7801,
that register (0x19), corresponds to the LED and MAC address
configuration, resulting in unapropriate behavior."
I did not test with the DP8322I PHY, but I tested with an EVB-LAN7800
with the internal PHY.
Fixes: 14437e3fa284 ("lan78xx: workaround of forced 100 Full/Half duplex mode error")
Signed-off-by: Yuiko Oshino <yuiko.oshino@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20230301154307.30438-1-yuiko.oshino@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Avoid crashing the machine by checking
info->attrs[NL802154_ATTR_SCAN_TYPE] presence before de-referencing it,
which was the primary intend of the blamed patch.
Reported-by: Sanan Hasanov <sanan.hasanov@Knights.ucf.edu>
Suggested-by: Eric Dumazet <edumazet@google.com>
Fixes: a0b6106672b5 ("ieee802154: Convert scan error messages to extack")
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/r/20230301154450.547716-1-miquel.raynal@bootlin.com
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
|
|
This patch fixes a buffer overflow access of skb->data if
ieee802154_hdr_peek_addrs() fails.
Reported-by: lianhui tang <bluetlh@gmail.com>
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Link: https://lore.kernel.org/r/20230217042504.3303396-1-aahringo@redhat.com
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
|
|
Florian reported a regression and sent a patch with the following
changelog:
<quote>
There is a noticeable tcp performance regression (loopback or cross-netns),
seen with iperf3 -Z (sendfile mode) when generic retpolines are needed.
With SK_RECLAIM_THRESHOLD checks gone number of calls to enter/leave
memory pressure happen much more often. For TCP indirect calls are
used.
We can't remove the if-set-return short-circuit check in
tcp_enter_memory_pressure because there are callers other than
sk_enter_memory_pressure. Doing a check in the sk wrapper too
reduces the indirect calls enough to recover some performance.
Before,
0.00-60.00 sec 322 GBytes 46.1 Gbits/sec receiver
After:
0.00-60.04 sec 359 GBytes 51.4 Gbits/sec receiver
"iperf3 -c $peer -t 60 -Z -f g", connected via veth in another netns.
</quote>
It seems we forgot to upstream this indirect call mitigation we
had for years, lets do this instead.
[edumazet] - It seems we forgot to upstream this indirect call
mitigation we had for years, let's do this instead.
- Changed to INDIRECT_CALL_INET_1() to avoid bots reports.
Fixes: 4890b686f408 ("net: keep sk->sk_forward_alloc as small as possible")
Reported-by: Florian Westphal <fw@strlen.de>
Link: https://lore.kernel.org/netdev/20230227152741.4a53634b@kernel.org/T/
Signed-off-by: Brian Vazquez <brianvv@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20230301133247.2346111-1-edumazet@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net:
1) Fix bogus error report in selftests/netfilter/nft_nat.sh,
from Hangbin Liu.
2) Initialize last and quota expressions from template when
expr_ops::clone is called, otherwise, states are not restored
accordingly when loading a dynamic set with elements using
these two expressions.
* git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: nft_quota: copy content when cloning expression
netfilter: nft_last: copy content when cloning expression
selftests: nft_nat: ensuring the listening side is up before starting the client
====================
Link: https://lore.kernel.org/r/20230301222021.154670-1-pablo@netfilter.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|