Age | Commit message (Collapse) | Author | Files | Lines |
|
Eric Dumazet suggests:
> The fact that mptcp_is_tcpsk() was able to write over sock->ops was a
> bit strange to me.
> mptcp_is_tcpsk() should answer a question, with a read-only argument.
re-factor code to avoid overwriting sock_ops inside that function. Also,
change the helper name to reflect the semantics and to disambiguate from
its dual, sk_is_mptcp(). While at it, collapse mptcp_stream_accept() and
mptcp_accept() into a single function, where fallback / non-fallback are
separated into a single sk_is_mptcp() conditional.
Link: https://github.com/multipath-tcp/mptcp_net-next/issues/432
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Better align function variables to open parenthesis as suggested by
checkpatch script for qca808x function to make code cleaner.
For cable_test_get_status function some additional rework was needed to
handle too long functions.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Victor Nogueira says:
====================
net/sched: Introduce tc block ports tracking and use
__context__
The "tc block" is a collection of netdevs/ports which allow qdiscs to share
match-action block instances (as opposed to the traditional tc filter per
netdev/port)[1].
Up to this point in the implementation, the block is unaware of its ports.
This patch makes the tc block ports available to the datapath.
For the datapath we provide a use case of the tc block in a mirred
action in patch 3. For users can levarage mirred to do something like
the following:
$ tc qdisc add dev ens7 ingress_block 22 clsact
$ tc qdisc add dev ens8 ingress_block 22 clsact
$ tc qdisc add dev ens9 ingress_block 22 clsact
$ tc filter add block 22 protocol ip pref 25 \
flower dst_ip 192.168.0.0/16 action mirred egress mirror blockid 22
In this case, if the packet arrives on ens8, it will be copied and sent to
all ports in the block excluding ens8. Note that the packet is still in
the pipeline at this point - meaning other actions could be added after the
mirror because mirred copies/clones the skb. Example the following is
valid:
$ tc filter add block 22 protocol ip pref 25 flower dst_ip 192.168.0.0/16 \
action mirred egress mirror blockid 22 \
action vlan push id 123 \
action mirred egress redirect dev dummy0
redirect behavior always steals the packet from the pipeline and therefore
the skb is no longer available for a subsequent action as illustrated above
(in redirecting to dummy0).
The behavior of redirecting to a tc block is therefore adapted to work in
the same manner. So a setup as such:
$ tc qdisc add dev ens7 ingress_block 22
$ tc qdisc add dev ens8 ingress_block 22
$ tc qdisc add dev ens9 ingress_block 22
$ tc filter add block 22 protocol ip pref 25 \
flower dst_ip 192.168.0.0/16 action mirred egress redirect blockid 22
for a matching packet arriving on ens7 will first send a copy/clone to ens8
(as in the "mirror" behavior) then to ens9 as in the redirect behavior
above. Once this processing is done - no other actions are able to process
this skb. i.e it is removed from the "pipeline".
In this case, if the packet arrives on ens8, it will be copied and sent to
all ports in the block excluding ens8.
Patch 1 separates/exports mirror and redirect functions from act_mirred
Patch 2 introduces the required infra.
Patch 3 Allows mirred to blocks
Subsequent patches will come with tdc test cases.
__Acknowledgements__
Suggestions from Vlad Buslov and Marcelo Ricardo Leitner made this patchset
better. The idea of integrating the ports into the tc block was suggested
by Jiri Pirko.
[1] See commit ca46abd6f89f ("Merge branch'net-sched-allow-qdiscs-to-share-filter-block-instances'")
Changes in v2:
- Remove RFC tag
- Add more details in patch 0(Jiri)
- When CONFIG_NET_TC_SKB_EXT is selected we have unused qdisc_cb
Reported-by: kernel test robot <lkp@intel.com> (and
horms@kernel.org)
- Fix bad dev dereference in printk of blockcast action (Simon)
Changes in v3:
- Add missing xa_destroy (pointed out by Vlad)
- Remove bugfix pointed by Vlad (will send in separate patch)
- Removed ports from subject in patch #2 and typos (suggested by
Marcelo)
- Remove net_notice_ratelimited debug messages in error
cases (suggested by Marcelo)
- Minor changes to appease sparse's lock context warning
Changes in v4:
- Avoid code repetition using gotos in cast_one (suggested by Paolo)
- Fix typo in cover letter (pointed out by Paolo)
- Create a module description for act_blockcast
(reported by Paolo and CI)
Changes in v5:
- Add new patch which separated mirred into mirror and redirect
functions (suggested by Jiri)
- Instead of repeating the code to mirror in blockcast use mirror
exported function by patch1 (tcf_mirror_act)
- Make Block ID into act_blockcast's parameter passed by user space
instead of always getting it from SKB (suggested by Jiri)
- Add tx_type parameter which will specify what transmission behaviour
we want (as described earlier)
Changes in v6:
- Remove blockcast and make it a part of mirred (suggestd by Jiri)
- Block ID is now a mirred parameter
- We now allow redirecting and mirroring to either ingress or egress
Changes in v7:
- Remove set but not used variable in tcf_mirred_act (pointed out by
Jakub)
Changes in v8:
- Fix uapi issues (pointed out by Jiri)
- Separate last patch into 3 - two as preparations for adding
block ID to mirred and one allowing mirred to block (suggested by Jiri)
- Remove declaration initialisation of eg_block and in_block in
qdisc_block_add_dev (suggested by Jiri)
- Avoid unnecessary if guards in qdisc_block_add_dev (suggested by Jiri)
- Remove unncessary block_index retrieval in __qdisc_destroy
(suggested by Jiri)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
So far the mirred action has dealt with syntax that handles
mirror/redirection for netdev. A matching packet is redirected or mirrored
to a target netdev.
In this patch we enable mirred to mirror to a tc block as well.
IOW, the new syntax looks as follows:
... mirred <ingress | egress> <mirror | redirect> [index INDEX] < <blockid BLOCKID> | <dev <devname>> >
Examples of mirroring or redirecting to a tc block:
$ tc filter add block 22 protocol ip pref 25 \
flower dst_ip 192.168.0.0/16 action mirred egress mirror blockid 22
$ tc filter add block 22 protocol ip pref 25 \
flower dst_ip 10.10.10.10/32 action mirred egress redirect blockid 22
Co-developed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Co-developed-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: Victor Nogueira <victor@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The act of replacing a device will be repeated by the init logic for the
block ID in the patch that allows mirred to a block. Therefore we
encapsulate this functionality in a function (tcf_mirred_replace_dev) so
that we can reuse it and avoid code repetition.
Co-developed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Co-developed-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: Victor Nogueira <victor@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
As a preparation for adding block ID to mirred, separate the part of
mirred that redirect/mirrors to a dev into a specific function so that it
can be called by blockcast for each dev.
Also improve readability. Eg. rename use_reinsert to dont_clone and skb2
to skb_to_send.
Co-developed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Co-developed-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: Victor Nogueira <victor@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The datapath can now find the block of the port in which the packet arrived
at.
In the next patch we show a possible usage of this patch in a new
version of mirred that multicasts to all ports except for the port in
which the packet arrived on.
Co-developed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Co-developed-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: Victor Nogueira <victor@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This commit makes tc blocks track which ports have been added to them.
And, with that, we'll be able to use this new information to send
packets to the block's ports. Which will be done in the patch #3 of this
series.
Suggested-by: Jiri Pirko <jiri@nvidia.com>
Co-developed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Co-developed-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: Victor Nogueira <victor@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The dns_resolver_preparse() function has a check on the size of the
payload for the basic header of the binary-style payload, but is missing
a check for the size of the V1 server-list payload header after
determining that's what we've been given.
Fix this by getting rid of the the pointer to the basic header and just
assuming that we have a V1 server-list payload and moving the V1 server
list pointer inside the if-statement. Dealing with other types and
versions can be left for when such have been defined.
This can be tested by doing the following with KASAN enabled:
echo -n -e '\x0\x0\x1\x2' | keyctl padd dns_resolver foo @p
and produces an oops like the following:
BUG: KASAN: slab-out-of-bounds in dns_resolver_preparse+0xc9f/0xd60 net/dns_resolver/dns_key.c:127
Read of size 1 at addr ffff888028894084 by task syz-executor265/5069
...
Call Trace:
dns_resolver_preparse+0xc9f/0xd60 net/dns_resolver/dns_key.c:127
__key_create_or_update+0x453/0xdf0 security/keys/key.c:842
key_create_or_update+0x42/0x50 security/keys/key.c:1007
__do_sys_add_key+0x29c/0x450 security/keys/keyctl.c:134
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0x40/0x110 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x62/0x6a
This patch was originally by Edward Adam Davis, but was modified by
Linus.
Fixes: b946001d3bb1 ("keys, dns: Allow key types (eg. DNS) to be reclaimed immediately on expiry")
Reported-and-tested-by: syzbot+94bbb75204a05da3d89f@syzkaller.appspotmail.com
Link: https://lore.kernel.org/r/0000000000009b39bc060c73e209@google.com/
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: David Howells <dhowells@redhat.com>
Cc: Edward Adam Davis <eadavis@qq.com>
Cc: Jarkko Sakkinen <jarkko@kernel.org>
Cc: Jeffrey E Altman <jaltman@auristor.com>
Cc: Wang Lei <wang840925@gmail.com>
Cc: Jeff Layton <jlayton@redhat.com>
Cc: Steve French <sfrench@us.ibm.com>
Cc: Marc Dionne <marc.dionne@auristor.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Since there are no more users of the macro let's finally
burn it
Signed-off-by: Denis Kirjanov <dkirjanov@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
SOCK_DEBUG comes from the old days. Let's
move logging to standard net core ratelimited logging functions
Signed-off-by: Denis Kirjanov <dkirjanov@suse.de>
changes in v2:
- remove SOCK_DEBUG macro altogether
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Couple of structures was not marked as __packed. This patch
fixes the same and mark them as __packed.
Fixes: 42006910b5ea ("octeontx2-af: cleanup KPU config data")
Signed-off-by: Suman Ghosh <sumang@marvell.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Wen Gu says:
====================
net/smc: implement SMCv2.1 virtual ISM device support
The fourth edition of SMCv2 adds the SMC version 2.1 feature updates for
SMC-Dv2 with virtual ISM. Virtual ISM are created and supported mainly by
OS or hypervisor software, comparable to IBM ISM which is based on platform
firmware or hardware.
With the introduction of virtual ISM, SMCv2.1 makes some updates:
- Introduce feature bitmask to indicate supplemental features.
- Reserve a range of CHIDs for virtual ISM.
- Support extended GIDs (128 bits) in CLC handshake.
So this patch set aims to implement these updates in Linux kernel. And it
acts as the first part of SMC-D virtual ISM extension & loopback-ism [1].
[1] https://lore.kernel.org/netdev/1695568613-125057-1-git-send-email-guwen@linux.alibaba.com/
v8->v7:
- Patch #7: v7 mistakenly changed the type of gid_ext in
smc_clc_msg_accept_confirm to u64 instead of __be64 as previous versions
when fixing the rebase conflicts. So fix this mistake.
v7->v6:
Link: https://lore.kernel.org/netdev/20231219084536.8158-1-guwen@linux.alibaba.com/
- Collect the Reviewed-by tag in v6;
- Patch #3: redefine the struct smc_clc_msg_accept_confirm;
- Patch #7: Because that the Patch #3 already adds '__packed' to
smc_clc_msg_accept_confirm, so Patch #7 doesn't need to do the same thing.
But this is a minor change, so I kept the 'Reviewed-by' tag.
Other changes in previous versions but not yet acked:
- Patch #1: Some minor changes in subject and fix the format issue
(length exceeds 80 columns) compared to v3.
- Patch #5: removes useless ini->feature_mask assignment in __smc_connect()
and smc_listen_v2_check() compared to v4.
- Patch #8: new added, compared to v3.
v6->v5:
Link: https://lore.kernel.org/netdev/1702371151-125258-1-git-send-email-guwen@linux.alibaba.com/
- Add 'Reviewed-by' label given in the previous versions:
* Patch #4, #6, #9, #10 have nothing changed since v3;
- Patch #2:
* fix the format issue (Alignment should match open parenthesis) compared to v5;
* remove useless clc->hdr.length assignment in smcr_clc_prep_confirm_accept()
compared to v5;
- Patch #3: new added compared to v5.
- Patch #7: some minor changes like aclc_v2->aclc or clc_v2->clc compared to v5
due to the introduction of Patch #3. Since there were no major changes, I kept
the 'Reviewed-by' label.
Other changes in previous versions but not yet acked:
- Patch #1: Some minor changes in subject and fix the format issue
(length exceeds 80 columns) compared to v3.
- Patch #5: removes useless ini->feature_mask assignment in __smc_connect()
and smc_listen_v2_check() compared to v4.
- Patch #8: new added, compared to v3.
v5->v4:
Link: https://lore.kernel.org/netdev/1702021259-41504-1-git-send-email-guwen@linux.alibaba.com/
- Patch #6: improve the comment of SMCD_CLC_MAX_V2_GID_ENTRIES;
- Patch #4: remove useless ini->feature_mask assignment;
v4->v3:
https://lore.kernel.org/netdev/1701920994-73705-1-git-send-email-guwen@linux.alibaba.com/
- Patch #6: use SMCD_CLC_MAX_V2_GID_ENTRIES to indicate the max gid
entries in CLC proposal and using SMC_MAX_V2_ISM_DEVS to indicate the
max devices to propose;
- Patch #6: use i and i+1 in smc_find_ism_v2_device_serv();
- Patch #2: replace the large if-else block in smc_clc_send_confirm_accept()
with 2 subfunctions;
- Fix missing byte order conversion of GID and token in CLC handshake,
which is in a separate patch sending to net:
https://lore.kernel.org/netdev/1701882157-87956-1-git-send-email-guwen@linux.alibaba.com/
- Patch #7: add extended GID in SMC-D lgr netlink attribute;
v3->v2:
https://lore.kernel.org/netdev/1701343695-122657-1-git-send-email-guwen@linux.alibaba.com/
- Rename smc_clc_fill_fce as smc_clc_fill_fce_v2x;
- Remove ISM_IDENT_MASK from drivers/s390/net/ism.h;
- Add explicitly assigning 'false' to ism_v2_capable in ism_dev_init();
- Remove smc_ism_set_v2_capable() helper for now, and introduce it in
later loopback-ism implementation;
v2->v1:
- Fix sparse complaint;
- Rebase to the latest net-next;
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The System EID (SEID) is an internal EID that is used by the SMCv2
software stack that has a predefined and constant value representing
the s390 physical machine that the OS is executing on. So it should
be managed by SMC stack instead of ISM driver and be consistent for
all ISMv2 device (including virtual ISM devices) on s390 architecture.
Suggested-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Reviewed-and-tested-by: Wenjia Zhang <wenjia@linux.ibm.com>
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The system EID (SEID) is an internal EID used by SMC-D to represent the
s390 physical machine that OS is executing on. On s390 architecture, it
predefined by fixed string and part of cpuid and is enabled regardless
of whether underlay device is virtual ISM or platform firmware ISM.
However on non-s390 architectures where SMC-D can be used with virtual
ISM devices, there is no similar information to identify physical
machines, especially in virtualization scenarios. So in such cases, SEID
is forcibly disabled and the user-defined UEID will be used to represent
the communicable space.
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Reviewed-and-tested-by: Wenjia Zhang <wenjia@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Virtual ISM devices introduced in SMCv2.1 requires a 128 bit extended
GID vs. the existing ISM 64bit GID. So the 2nd 64 bit of extended GID
should be included in SMC-D linkgroup netlink attribute as well.
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
According to virtual ISM support feature defined by SMCv2.1, GIDs of
virtual ISM device are UUIDs defined by RFC4122, which are 128-bits
long. So some adaptation work is required. And note that the GIDs of
existing platform firmware ISM devices still remain 64-bits long.
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
According to virtual ISM support feature defined by SMCv2.1, CHIDs in
the range 0xFF00 to 0xFFFF are reserved for use by virtual ISM devices.
And two helpers are introduced to distinguish virtual ISM devices from
the existing platform firmware ISM devices.
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Reviewed-and-tested-by: Wenjia Zhang <wenjia@linux.ibm.com>
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This introduces virtual ISM device support feature to SMCv2.1 as the
first supplemental feature.
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch adds SMCv2.x supplemental features negotiation. Supported
SMCv2.x supplemental features are represented by feature_mask in FCE
field. The negotiation process is as follows.
Server Client
Proposal(features(c-mask bits))
<-----------------------------------------
Accept(features(s-mask bits))
----------------------------------------->
Confirm(features(s&c-mask bits))
<-----------------------------------------
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Reviewed-and-tested-by: Wenjia Zhang <wenjia@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The structs of CLC accept and confirm messages for SMCv1 and SMCv2 are
separately defined and often casted to each other in the code, which may
increase the risk of errors caused by future divergence of them. So
unify them into one struct for better maintainability.
Suggested-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There is a large if-else block in smc_clc_send_confirm_accept() and it
is better to split it into two sub-functions.
Suggested-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Rename some functions or variables with 'fce' in their name but used in
SMCv2.1 as 'fce_v2x' for clarity.
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Size of the virtchnl2_rss_key struct should be 7 bytes but the
compiler introduces a padding byte for the structure alignment.
This results in idpf sending an additional byte of memory to the device
control plane than the expected buffer size. As the control plane
enforces virtchnl message size checks to validate the message,
set RSS key message fails resulting in the driver load failure.
Remove implicit compiler padding by using "__packed" structure
attribute for the virtchnl2_rss_key struct.
Also there is no need to use __DECLARE_FLEX_ARRAY macro for the
'key_flex' struct field. So drop it.
Fixes: 0d7502a9b4a7 ("virtchnl: add virtchnl version 2 ops")
Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com>
Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Scott Register <scott.register@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
idpf_ring::skb serves only for keeping an incomplete frame between
several NAPI Rx polling cycles, as one cycle may end up before
processing the end of packet descriptor. The pointer is taken from
the ring onto the stack before entering the loop and gets written
there after the loop exits. When inside the loop, only the onstack
pointer is used.
For some reason, the logics is broken in the singleq mode, where the
pointer is taken from the ring each iteration. This means that if a
frame got fragmented into several descriptors, each fragment will have
its own skb, but only the last one will be passed up the stack
(containing garbage), leaving the rest leaked.
Then, on ifdown, rxq::skb is being freed only in the splitq mode, while
it can point to a valid skb in singleq as well. This can lead to a yet
another skb leak.
Just don't touch the ring skb field inside the polling loop, letting
the onstack skb pointer work as expected: build a new skb if it's the
first frame descriptor and attach a frag otherwise. On ifdown, free
rxq::skb unconditionally if the pointer is non-NULL.
Fixes: a5ab9ee0df0b ("idpf: add singleq start_xmit and napi poll")
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Tested-by: Scott Register <scott.register@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
For the QUEUE_FLAG_HW_WC to actually work, it needs to have a separate
number from QUEUE_FLAG_FUA, doh.
Fixes: 43c9835b144c ("block: don't allow enabling a cache on devices that don't support it")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20231226081524.180289-1-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Pull virtio fixes from Michael Tsirkin:
"A couple of bugfixes: one for a regression"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
virtio_blk: fix snprintf truncation compiler warning
virtio_ring: fix syncs DMA memory with different direction
|
|
@ 2023-12-19 17:49 Siddh Raman Pant
2023-12-19 17:49 ` [PATCH net-next v7 1/2] nfc: llcp_core: Hold a ref to llcp_local->dev when holding a ref to llcp_local Siddh Raman Pant
2023-12-19 17:49 ` [PATCH net-next v7 2/2] nfc: Do not send datagram if socket state isn't LLCP_BOUND Siddh Raman Pant
0 siblings, 2 replies; 4+ messages in thread
Siddh Raman Pant says:
====================
[PATCH net-next v7 0/2] nfc: Fix UAF during datagram sending caused by missing refcounting
Changes in v7:
- Stupidly reverted ordering in recv() too, fix that.
- Remove redundant call to nfc_llcp_sock_free().
Changes in v6:
- Revert label introduction from v4, and thus also v5 entirely.
Changes in v5:
- Move reason = LLCP_DM_REJ under the fail_put_sock label.
- Checkpatch now warns about == NULL check for new_sk, so fix that,
and also at other similar places in the same function.
Changes in v4:
- Fix put ordering and comments.
- Separate freeing in recv() into end labels.
- Remove obvious comment and add reasoning.
- Picked up r-bs by Suman.
Changes in v3:
- Fix missing freeing statements.
Changes in v2:
- Add net-next in patch subject.
- Removed unnecessary extra lock and hold nfc_dev ref when holding llcp_sock.
- Remove last formatting patch.
- Picked up r-b from Krzysztof for LLCP_BOUND patch.
---
For connectionless transmission, llcp_sock_sendmsg() codepath will
eventually call nfc_alloc_send_skb() which takes in an nfc_dev as
an argument for calculating the total size for skb allocation.
virtual_ncidev_close() codepath eventually releases socket by calling
nfc_llcp_socket_release() (which sets the sk->sk_state to LLCP_CLOSED)
and afterwards the nfc_dev will be eventually freed.
When an ndev gets freed, llcp_sock_sendmsg() will result in an
use-after-free as it
(1) doesn't have any checks in place for avoiding the datagram sending.
(2) calls nfc_llcp_send_ui_frame(), which also has a do-while loop
which can race with freeing. This loop contains the call to
nfc_alloc_send_skb() where we dereference the nfc_dev pointer.
nfc_dev is being freed because we do not hold a reference to it when
we hold a reference to llcp_local. Thus, virtual_ncidev_close()
eventually calls nfc_release() due to refcount going to 0.
Since state has to be LLCP_BOUND for datagram sending, we can bail out
early in llcp_sock_sendmsg().
Please review and let me know if any errors are there, and hopefully
this gets accepted.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
As we know we cannot send the datagram (state can be set to LLCP_CLOSED
by nfc_llcp_socket_release()), there is no need to proceed further.
Thus, bail out early from llcp_sock_sendmsg().
Signed-off-by: Siddh Raman Pant <code@siddh.me>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Suman Ghosh <sumang@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
llcp_sock_sendmsg() calls nfc_llcp_send_ui_frame() which in turn calls
nfc_alloc_send_skb(), which accesses the nfc_dev from the llcp_sock for
getting the headroom and tailroom needed for skb allocation.
Parallelly the nfc_dev can be freed, as the refcount is decreased via
nfc_free_device(), leading to a UAF reported by Syzkaller, which can
be summarized as follows:
(1) llcp_sock_sendmsg() -> nfc_llcp_send_ui_frame()
-> nfc_alloc_send_skb() -> Dereference *nfc_dev
(2) virtual_ncidev_close() -> nci_free_device() -> nfc_free_device()
-> put_device() -> nfc_release() -> Free *nfc_dev
When a reference to llcp_local is acquired, we do not acquire the same
for the nfc_dev. This leads to freeing even when the llcp_local is in
use, and this is the case with the UAF described above too.
Thus, when we acquire a reference to llcp_local, we should acquire a
reference to nfc_dev, and release the references appropriately later.
References for llcp_local is initialized in nfc_llcp_register_device()
(which is called by nfc_register_device()). Thus, we should acquire a
reference to nfc_dev there.
nfc_unregister_device() calls nfc_llcp_unregister_device() which in
turn calls nfc_llcp_local_put(). Thus, the reference to nfc_dev is
appropriately released later.
Reported-and-tested-by: syzbot+bbe84a4010eeea00982d@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=bbe84a4010eeea00982d
Fixes: c7aa12252f51 ("NFC: Take a reference on the LLCP local pointer when creating a socket")
Reviewed-by: Suman Ghosh <sumang@marvell.com>
Signed-off-by: Siddh Raman Pant <code@siddh.me>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit 2f3ce7a56c6e ("net: sfp: rework the RollBall PHY waiting code")
changed the long wait before accessing RollBall / FS modules into
probing for PHY every 1 second, and trying 25 times.
Wei Lei reports that this does not work correctly on FS modules: when
initializing, they may report values different from 0xffff in PHY ID
registers for some MMDs, causing get_phy_c45_ids() to find some bogus
MMD.
Fix this by adding the module_t_wait member back, and setting it to 4
seconds for FS modules.
Fixes: 2f3ce7a56c6e ("net: sfp: rework the RollBall PHY waiting code")
Reported-by: Wei Lei <quic_leiwei@quicinc.com>
Signed-off-by: Marek Behún <kabel@kernel.org>
Tested-by: Lei Wei <quic_leiwei@quicinc.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If prev_badblocks() returns '-1', it means no valid badblocks record
before the checking range. It doesn't make sense to check whether
the input checking range is overlapped with the non-existed invalid
front range.
This patch checkes whether 'prev >= 0' is true before calling
overlap_front(), to void such invalid operations.
Fixes: 3ea3354cb9f0 ("badblocks: improve badblocks_check() for multiple ranges handling")
Reported-and-tested-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Coly Li <colyli@suse.de>
Link: https://lore.kernel.org/nvdimm/3035e75a-9be0-4bc3-8d4a-6e52c207f277@leemhuis.info/
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Geliang Tang <geliang.tang@suse.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: NeilBrown <neilb@suse.de>
Cc: Vishal L Verma <vishal.l.verma@intel.com>
Cc: Xiao Ni <xni@redhat.com>
Link: https://lore.kernel.org/r/20231224002820.20234-1-colyli@suse.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
- Fix a secondary CPUs enumeration regression caused by creative MADT
APIC table entries on certain systems.
- Fix a race in the NOP-patcher that can spuriously trigger crashes on
bootup.
- Fix a bootup failure regression caused by the parallel bringup code,
caused by firmware inconsistency between the APIC initialization
states of the boot and secondary CPUs, on certain systems.
* tag 'x86-urgent-2023-12-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/acpi: Handle bogus MADT APIC tables gracefully
x86/alternatives: Disable interrupts and sync when optimizing NOPs in place
x86/alternatives: Sync core before enabling interrupts
x86/smpboot/64: Handle X2APIC BIOS inconsistency gracefully
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Four small fixes, three in drivers with the core one adding a batch
indicator (for drivers which use it) to the error handler"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: ufs: core: Let the sq_lock protect sq_tail_slot access
scsi: ufs: qcom: Return ufs_qcom_clk_scale_*() errors in ufs_qcom_clk_scale_notify()
scsi: core: Always send batch on reset or error handling command
scsi: bnx2fc: Fix skb double free in bnx2fc_rcv()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB / Thunderbolt fixes from Greg KH:
"Here are some small bugfixes and new device ids for USB and
Thunderbolt drivers for 6.7-rc7. Included in here are:
- new usb-serial device ids
- thunderbolt driver fixes
- typec driver fix
- usb-storage driver quirk added
- fotg210 driver fix
All of these have been in linux-next with no reported issues"
* tag 'usb-6.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
USB: serial: option: add Quectel EG912Y module support
USB: serial: ftdi_sio: update Actisense PIDs constant names
usb: fotg210-hcd: delete an incorrect bounds test
usb-storage: Add quirk for incorrect WP on Kingston DT Ultimate 3.0 G3
usb: typec: ucsi: fix gpio-based orientation detection
net: usb: ax88179_178a: avoid failed operations when device is disconnected
USB: serial: option: add Quectel RM500Q R13 firmware support
USB: serial: option: add Foxconn T99W265 with new baseline
thunderbolt: Fix minimum allocated USB 3.x and PCIe bandwidth
thunderbolt: Fix memory leak in margining_port_remove()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char / misc driver fixes from Greg KH:
"Here are a small number of various driver fixes for 6.7-rc7 that
normally come through the char-misc tree, and one debugfs fix as well.
Included in here are:
- iio and hid sensor driver fixes for a number of small things
- interconnect driver fixes
- brcm_nvmem driver fixes
- debugfs fix for previous fix
- guard() definition in device.h so that many subsystems can start
using it for 6.8-rc1 (requested by Dan Williams to make future
merges easier)
All of these have been in linux-next for a while with no reported
issues"
* tag 'char-misc-6.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (21 commits)
debugfs: initialize cancellations earlier
Revert "iio: hid-sensor-als: Add light color temperature support"
Revert "iio: hid-sensor-als: Add light chromaticity support"
nvmem: brcm_nvram: store a copy of NVRAM content
dt-bindings: nvmem: mxs-ocotp: Document fsl,ocotp
driver core: Add a guard() definition for the device_lock()
interconnect: qcom: icc-rpm: Fix peak rate calculation
iio: adc: MCP3564: fix hardware identification logic
iio: adc: MCP3564: fix calib_bias and calib_scale range checks
iio: adc: meson: add separate config for axg SoC family
iio: adc: imx93: add four channels for imx93 adc
iio: adc: ti_am335x_adc: Fix return value check of tiadc_request_dma()
interconnect: qcom: sm8250: Enable sync_state
iio: triggered-buffer: prevent possible freeing of wrong buffer
iio: imu: inv_mpu6050: fix an error code problem in inv_mpu6050_read_raw
iio: imu: adis16475: use bit numbers in assign_bit()
iio: imu: adis16475: add spi_device_id table
iio: tmag5273: fix temperature offset
interconnect: Treat xlate() returning NULL node as an error
iio: common: ms_sensors: ms_sensors_i2c: fix humidity conversion time table
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input updates from Dmitry Torokhov:
- a quirk to AT keyboard driver to skip issuing "GET ID" command when
8042 is in translated mode and the device is a laptop/portable,
because the "GET ID" command makes a bunch of recent laptops unhappy
- a quirk to i8042 to disable multiplexed mode on Acer P459-G2-M which
causes issues on resume
- psmouse will activate native RMI4 protocol support for touchpad on
ThinkPad L14 G1
- addition of Razer Wolverine V2 ID to xpad gamepad driver
- mapping for airplane mode button in soc_button_array driver for
TUXEDO laptops
- improved error handling in ipaq-micro-keys driver
- amimouse being prepared for platform remove callback returning void
* tag 'input-for-v6.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: soc_button_array - add mapping for airplane mode button
Input: xpad - add Razer Wolverine V2 support
Input: ipaq-micro-keys - add error handling for devm_kmemdup
Input: amimouse - convert to platform remove callback returning void
Input: i8042 - add nomux quirk for Acer P459-G2-M
Input: atkbd - skip ATKBD_CMD_GETID in translated mode
Input: psmouse - enable Synaptics InterTouch for ThinkPad L14 G1
|
|
Commit 56769ba4b297 ("kbuild: unify vdso_install rules") accidentally
dropped the '.debug' suffix from the build ID symlinks.
Fixes: 56769ba4b297 ("kbuild: unify vdso_install rules")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
When a path contains relative symbolic links, os.path.abspath() might
not follow the symlinks and instead return the absolute path with just
the relative paths resolved, resulting in an incorrect path.
1. Say "drivers/hdf/" has some symlinks:
# ls -l drivers/hdf/
total 364
drwxrwxr-x 2 ... 4096 ... evdev
lrwxrwxrwx 1 ... 44 ... framework -> ../../../../../../drivers/hdf_core/framework
-rw-rw-r-- 1 ... 359010 ... hdf_macro_test.h
lrwxrwxrwx 1 ... 55 ... inner_api -> ../../../../../../drivers/hdf_core/interfaces/inner_api
lrwxrwxrwx 1 ... 53 ... khdf -> ../../../../../../drivers/hdf_core/adapter/khdf/linux
-rw-r--r-- 1 ... 74 ... Makefile
drwxrwxr-x 3 ... 4096 ... wifi
2. One .cmd file records that:
# head -1 ./framework/core/manager/src/.devmgr_service.o.cmd
cmd_drivers/hdf/khdf/manager/../../../../framework/core/manager/src/devmgr_service.o := ... \
/path/to/src/drivers/hdf/khdf/manager/../../../../framework/core/manager/src/devmgr_service.c
3. os.path.abspath returns "/path/to/src/framework/core/manager/src/devmgr_service.c", not correct:
# ./scripts/clang-tools/gen_compile_commands.py
INFO: Could not add line from ./framework/core/manager/src/.devmgr_service.o.cmd: File \
/path/to/src/framework/core/manager/src/devmgr_service.c not found
Use os.path.realpath(), which resolves the symlinks and normalizes the paths correctly.
# cat compile_commands.json
...
{
"command": ...
"directory": ...
"file": "/path/to/bla/drivers/hdf_core/framework/core/manager/src/devmgr_service.c"
},
...
Also fix it in parse_arguments().
Signed-off-by: Jialu Xu <xujialu@vimux.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
Masahiro has always applied scripts/clang-tools patches but it is not
included in the Kbuild section, so neither he nor linux-kbuild get cc'd
on patches that touch those files.
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Nicolas Schier <nicolas@fjasle.eu>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
An alignment of 4 bytes is wrong for 64-bit platforms which don't define
CONFIG_HAVE_ARCH_PREL32_RELOCATIONS (which then store 64-bit pointers).
Fix their alignment to 8 bytes.
Fixes: ddb5cdbafaaa ("kbuild: generate KSYMTAB entries by modpost")
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
This add a mapping for the airplane mode button on the TUXEDO Pulse Gen3.
While it is physically a key it behaves more like a switch, sending a key
down on first press and a key up on 2nd press. Therefor the switch event
is used here. Besides this behaviour it uses the HID usage-id 0xc6
(Wireless Radio Button) and not 0xc8 (Wireless Radio Slider Switch), but
since neither 0xc6 nor 0xc8 are currently implemented at all in
soc_button_array this not to standard behaviour is not put behind a quirk
for the moment.
Signed-off-by: Christoffer Sandberg <cs@tuxedo.de>
Signed-off-by: Werner Sembach <wse@tuxedocomputers.com>
Link: https://lore.kernel.org/r/20231215171718.80229-1-wse@tuxedocomputers.com
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
Pull block fixes from Jens Axboe:
"Just an NVMe pull request this time, with a fix for bad sleeping
context, and a revert of a patch that caused some trouble"
* tag 'block-6.7-2023-12-22' of git://git.kernel.dk/linux:
nvme-pci: fix sleeping function called from interrupt context
Revert "nvme-fc: fix race between error recovery and creating association"
|
|
Pull kvm fixes from Paolo Bonzini:
"RISC-V:
- Fix a race condition in updating external interrupt for
trap-n-emulated IMSIC swfile
- Fix print_reg defaults in get-reg-list selftest
ARM:
- Ensure a vCPU's redistributor is unregistered from the MMIO bus if
vCPU creation fails
- Fix building KVM selftests for arm64 from the top-level Makefile
x86:
- Fix breakage for SEV-ES guests that use XSAVES
Selftests:
- Fix bad use of strcat(), by not using strcat() at all"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: SEV: Do not intercept accesses to MSR_IA32_XSS for SEV-ES guests
KVM: selftests: Fix dynamic generation of configuration names
RISCV: KVM: update external interrupt atomically for IMSIC swfile
KVM: riscv: selftests: Fix get-reg-list print_reg defaults
KVM: selftests: Ensure sysreg-defs.h is generated at the expected path
KVM: Convert comment into an assertion in kvm_io_bus_register_dev()
KVM: arm64: vgic: Ensure that slots_lock is held in vgic_register_all_redist_iodevs()
KVM: arm64: vgic: Force vcpu vgic teardown on vcpu destroy
KVM: arm64: vgic: Add a non-locking primitive for kvm_vgic_vcpu_destroy()
KVM: arm64: vgic: Simplify kvm_vgic_destroy()
|
|
Ioana Ciornei says:
====================
dpaa2-switch: small improvements
This patch set consists of a series of small improvements on the
dpaa2-switch driver ranging from adding some more verbosity when
encountering errors to reorganizing code to be easily extensible.
Changes in v3:
- 4/8: removed the fixes tag and moved it to the commit message
- 5/8: specified that there is no user-visible effect
- 6/8: removed the initialization of the err variable
Changes in v2:
- No changes to the actual diff, only rephrased some commit messages and
added more information.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In case a DPAA2 switch interface joins a bridge, the FDB used on the
port will be changed to the one associated with the bridge. What this
means exactly is that any VLAN installed on the port will need to be
removed and then installed back so that it points to the new FDB.
Once this is done, the previous FDB will become unused (no VLAN to
point to it). Even though no traffic will reach this FDB, it's best to
just cleanup the state of the FDB by zeroing its egress flood domain.
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Two different DPAA2 switch ports from two different DPSW instances
cannot be under the same bridge. Instead of checking for this
unsupported configuration in the CHANGEUPPER event, check it as early as
possible in the PRECHANGEUPPER one.
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Create separate functions, dpaa2_switch_port_prechangeupper and
dpaa2_switch_port_changeupper, to be called directly when a DPSW port
changes its upper device.
This way we are not open-coding everything in the main event callback
and we can easily extent, for example, with bond offload.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The DPSW object has multiple event sources multiplexed over the same
IRQ. The driver has the capability to configure only some of these
events to trigger the IRQ.
The dpsw_get_irq_status() can clear events automatically based on the
value stored in the 'status' variable passed to it. We don't want that
to happen because we could get into a situation when we are clearing
more events than we actually handled.
Just resort to manually clearing the events that we handled. Also, since
status is not used on the out path we remove its initialization to zero.
This change does not have a user-visible effect because the dpaa2-switch
driver enables and handles all the DPSW events which exist at the
moment.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|