<feed xmlns='http://www.w3.org/2005/Atom'>
<title>starfive-tech/linux.git/include/linux/skbuff.h, branch VF2_v2.8.0</title>
<subtitle>StarFive Tech Linux Kernel for VisionFive (JH7110) boards (mirror)</subtitle>
<id>https://git.radix-linux.su/starfive-tech/linux.git/atom?h=VF2_v2.8.0</id>
<link rel='self' href='https://git.radix-linux.su/starfive-tech/linux.git/atom?h=VF2_v2.8.0'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/starfive-tech/linux.git/'/>
<updated>2021-09-09T09:57:52+00:00</updated>
<entry>
<title>net/af_unix: fix a data-race in unix_dgram_poll</title>
<updated>2021-09-09T09:57:52+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2021-09-09T00:00:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/starfive-tech/linux.git/commit/?id=04f08eb44b5011493d77b602fdec29ff0f5c6cd5'/>
<id>urn:sha1:04f08eb44b5011493d77b602fdec29ff0f5c6cd5</id>
<content type='text'>
syzbot reported another data-race in af_unix [1]

Let's change __skb_insert() to use WRITE_ONCE() when changing
the skb head qlen.

Also, change unix_dgram_poll() to use the lockless version
of unix_recvq_full().

It is very possible we can switch all/most unix_recvq_full() calls
to the lockless version; this will be done in a future kernel version.
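
A minimal sketch of the two changes (illustrative; see the patch for
the exact hunks):

    /* include/linux/skbuff.h: annotate the write side */
    static inline void __skb_insert(struct sk_buff *newsk,
                                    struct sk_buff *prev, struct sk_buff *next,
                                    struct sk_buff_head *list)
    {
            /* pointer updates omitted; the store below pairs with
             * lockless readers such as skb_queue_len_lockless()
             */
            WRITE_ONCE(list-&gt;qlen, list-&gt;qlen + 1);
    }

    /* net/unix/af_unix.c: lockless variant for the poll path */
    static inline int unix_recvq_full_lockless(const struct sock *sk)
    {
            return skb_queue_len_lockless(&amp;sk-&gt;sk_receive_queue) &gt;
                    READ_ONCE(sk-&gt;sk_max_ack_backlog);
    }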

[1] HEAD commit: 8596e589b787732c8346f0482919e83cc9362db1

BUG: KCSAN: data-race in skb_queue_tail / unix_dgram_poll

write to 0xffff88814eeb24e0 of 4 bytes by task 25815 on cpu 0:
 __skb_insert include/linux/skbuff.h:1938 [inline]
 __skb_queue_before include/linux/skbuff.h:2043 [inline]
 __skb_queue_tail include/linux/skbuff.h:2076 [inline]
 skb_queue_tail+0x80/0xa0 net/core/skbuff.c:3264
 unix_dgram_sendmsg+0xff2/0x1600 net/unix/af_unix.c:1850
 sock_sendmsg_nosec net/socket.c:703 [inline]
 sock_sendmsg net/socket.c:723 [inline]
 ____sys_sendmsg+0x360/0x4d0 net/socket.c:2392
 ___sys_sendmsg net/socket.c:2446 [inline]
 __sys_sendmmsg+0x315/0x4b0 net/socket.c:2532
 __do_sys_sendmmsg net/socket.c:2561 [inline]
 __se_sys_sendmmsg net/socket.c:2558 [inline]
 __x64_sys_sendmmsg+0x53/0x60 net/socket.c:2558
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x3d/0x90 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae

read to 0xffff88814eeb24e0 of 4 bytes by task 25834 on cpu 1:
 skb_queue_len include/linux/skbuff.h:1869 [inline]
 unix_recvq_full net/unix/af_unix.c:194 [inline]
 unix_dgram_poll+0x2bc/0x3e0 net/unix/af_unix.c:2777
 sock_poll+0x23e/0x260 net/socket.c:1288
 vfs_poll include/linux/poll.h:90 [inline]
 ep_item_poll fs/eventpoll.c:846 [inline]
 ep_send_events fs/eventpoll.c:1683 [inline]
 ep_poll fs/eventpoll.c:1798 [inline]
 do_epoll_wait+0x6ad/0xf00 fs/eventpoll.c:2226
 __do_sys_epoll_wait fs/eventpoll.c:2238 [inline]
 __se_sys_epoll_wait fs/eventpoll.c:2233 [inline]
 __x64_sys_epoll_wait+0xf6/0x120 fs/eventpoll.c:2233
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x3d/0x90 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae

value changed: 0x0000001b -&gt; 0x00000001

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 25834 Comm: syz-executor.1 Tainted: G        W         5.14.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011

Fixes: 86b18aaa2b5b ("skbuff: fix a data race in skb_queue_len()")
Cc: Qian Cai &lt;cai@lca.pw&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>page_pool: keep pp info as long as page pool owns the page</title>
<updated>2021-08-09T22:49:00+00:00</updated>
<author>
<name>Yunsheng Lin</name>
<email>linyunsheng@huawei.com</email>
</author>
<published>2021-08-06T02:46:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/starfive-tech/linux.git/commit/?id=57f05bc2ab2443b89c2e2562c05053bcc7d30e8b'/>
<id>urn:sha1:57f05bc2ab2443b89c2e2562c05053bcc7d30e8b</id>
<content type='text'>
Currently, page-&gt;pp is cleared and set every time the page
is recycled, which is unnecessary.

So set page-&gt;pp only when the page is added to the page
pool, and clear it only when the page is released from the
page pool.

This is also preparation for supporting frag page allocation
in the page pool.
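
A minimal sketch of the intended split (helper names here are
illustrative):

    /* set once, when the page first enters the pool */
    static void page_pool_set_pp_info(struct page_pool *pool,
                                      struct page *page)
    {
            page-&gt;pp = pool;
    }

    /* clear once, when the page finally leaves the pool */
    static void page_pool_clear_pp_info(struct page *page)
    {
            page-&gt;pp = NULL;
    }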

Reviewed-by: Ilias Apalodimas &lt;ilias.apalodimas@linaro.org&gt;
Signed-off-by: Yunsheng Lin &lt;linyunsheng@huawei.com&gt;
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>skbuff: introduce skb_expand_head()</title>
<updated>2021-08-03T10:21:39+00:00</updated>
<author>
<name>Vasily Averin</name>
<email>vvs@virtuozzo.com</email>
</author>
<published>2021-08-02T08:52:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/starfive-tech/linux.git/commit/?id=f1260ff15a71b8fc122b2c9abd8a7abffb6e0168'/>
<id>urn:sha1:f1260ff15a71b8fc122b2c9abd8a7abffb6e0168</id>
<content type='text'>
Like skb_realloc_headroom(), the new helper increases the headroom of the
specified skb. Unlike skb_realloc_headroom(), it does not allocate a new skb
when expanding in place is possible; it sets skb-&gt;sk on the new skb when one
is needed, and it frees the original skb in case of failure.

This helps to simplify ip[6]_finish_output2() and a few other similar cases.
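
A sketch of the resulting call-site pattern (error handling
simplified):

    /* e.g. in ip_finish_output2(): grow the headroom in place when
     * possible
     */
    if (unlikely(skb_headroom(skb) &lt; hh_len &amp;&amp; dev-&gt;header_ops)) {
            skb = skb_expand_head(skb, hh_len);
            if (!skb)
                    return -ENOMEM; /* original skb already freed */
    }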

Signed-off-by: Vasily Averin &lt;vvs@virtuozzo.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>sk_buff: avoid potentially clearing 'slow_gro' field</title>
<updated>2021-07-30T18:48:06+00:00</updated>
<author>
<name>Paolo Abeni</name>
<email>pabeni@redhat.com</email>
</author>
<published>2021-07-30T16:30:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/starfive-tech/linux.git/commit/?id=a432934a30679c0e3c47b87f13e4901bc1a3fc03'/>
<id>urn:sha1:a432934a30679c0e3c47b87f13e4901bc1a3fc03</id>
<content type='text'>
If skb_dst_set_noref() is invoked with a NULL dst, the 'slow_gro'
field is cleared, too. That could lead to wrong behavior if
the skb later enters the GRO stage.

Fix the potential issue by preserving a non-zero value of
the 'slow_gro' field.

Additionally, fix a comment typo.
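
A minimal sketch of the fix (function body abridged):

    static inline void skb_dst_set_noref(struct sk_buff *skb,
                                         struct dst_entry *dst)
    {
            WARN_ON(!rcu_read_lock_held() &amp;&amp; !rcu_read_lock_bh_held());
            /* OR-in rather than assign, so a set bit is never cleared */
            skb-&gt;slow_gro |= !!dst;
            skb-&gt;_skb_refdst = (unsigned long)dst | SKB_DST_NOREF;
    }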

Reported-by: Sabrina Dubroca &lt;sd@queasysnail.net&gt;
Reported-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Fixes: 8a886b142bd0 ("sk_buff: track dst status in slow_gro")
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
Link: https://lore.kernel.org/r/aa42529252dc8bb02bd42e8629427040d1058537.1627662501.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>sk_buff: track dst status in slow_gro</title>
<updated>2021-07-29T11:18:11+00:00</updated>
<author>
<name>Paolo Abeni</name>
<email>pabeni@redhat.com</email>
</author>
<published>2021-07-28T16:24:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/starfive-tech/linux.git/commit/?id=8a886b142bd03d36612747e9aefdf0282c8b02dd'/>
<id>urn:sha1:8a886b142bd03d36612747e9aefdf0282c8b02dd</id>
<content type='text'>
Similar to the previous patch, but covering the dst field:
the slow_gro flag is additionally set when a dst is attached
to the skb.
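
A sketch of the pattern at one attach point (skb_dst_set() shown;
the other attach points follow suit):

    static inline void skb_dst_set(struct sk_buff *skb, struct dst_entry *dst)
    {
            /* mark the skb for the slow GRO prepare path */
            skb-&gt;slow_gro |= !!dst;
            skb-&gt;_skb_refdst = (unsigned long)dst;
    }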

RFC -&gt; v1:
 - use the existing flag instead of adding a new one

Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>sk_buff: introduce 'slow_gro' flags</title>
<updated>2021-07-29T11:18:11+00:00</updated>
<author>
<name>Paolo Abeni</name>
<email>pabeni@redhat.com</email>
</author>
<published>2021-07-28T16:23:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/starfive-tech/linux.git/commit/?id=5fc88f93edf2f797f1aa63334cc6c86f9c15d585'/>
<id>urn:sha1:5fc88f93edf2f797f1aa63334cc6c86f9c15d585</id>
<content type='text'>
The new flag tracks whether any state field is set that requires
'unusual'/slow prepare steps in GRO.

Set such flag when a ct entry is attached to the skb,
and never clear it.

The new bit uses an existing hole in the sk_buff struct.
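
A sketch of both pieces, assuming the ct-attach hook has roughly the
shape of nf_ct_set():

    /* in struct sk_buff, filling an existing hole */
    __u8                    slow_gro:1;

    /* set when a ct entry is attached; the bit is never cleared */
    static inline void nf_ct_set(struct sk_buff *skb, struct nf_conn *ct,
                                 enum ip_conntrack_info info)
    {
            skb-&gt;slow_gro |= !!ct;
            skb-&gt;_nfct = (unsigned long)ct | info;
    }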

RFC -&gt; v1:
 - use a single state bit, never clear it
 - avoid moving the _nfct field

Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bpf: cpumap: Implement generic cpumap</title>
<updated>2021-07-08T03:01:45+00:00</updated>
<author>
<name>Kumar Kartikeya Dwivedi</name>
<email>memxor@gmail.com</email>
</author>
<published>2021-07-02T11:18:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/starfive-tech/linux.git/commit/?id=11941f8a85362f612df61f4aaab0e41b64d2111d'/>
<id>urn:sha1:11941f8a85362f612df61f4aaab0e41b64d2111d</id>
<content type='text'>
This change implements CPUMAP redirect support for generic XDP programs.
The idea is to reuse the cpu map entry's queue that is used to push
native xdp frames, for redirecting skbs to a different CPU. This will
match native XDP behavior (in that RPS is invoked again for packets
reinjected into the networking stack).

To be able to determine whether the incoming skb is from the driver or
cpumap, we reuse the skb-&gt;redirected bit, which skips generic XDP
processing when set. To always be able to use this, the
CONFIG_NET_REDIRECT guard on it has been lifted and it is now always
available.

From the redirect side, we add the skb to the ptr_ring with its lowest
bit set to 1. This should be safe, as an skb is never 1-byte aligned,
and it allows the kthread to discern between xdp_frames and sk_buffs.
On consumption of the ptr_ring item, the lowest bit is unset.
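
A sketch of the tagging scheme (variable names here are illustrative):

    /* producer: an skb is never 1-byte aligned, so bit 0 is free */
    ret = ptr_ring_produce(rcpu-&gt;queue,
                           (void *)((unsigned long)skb | 0x1UL));

    /* consumer (kthread): test and strip the tag before dispatching */
    if ((unsigned long)ptr &amp; 0x1UL) {
            struct sk_buff *skb = (void *)((unsigned long)ptr &amp; ~0x1UL);

            list_add_tail(&amp;skb-&gt;list, &amp;skb_list);
    } else {
            struct xdp_frame *xdpf = ptr; /* native path, untouched */
    }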

In the end, the skb is simply added to the list that the kthread
already maintains for xdp_frames converted to skbs, and is then
received again via netif_receive_skb_list().

Bulking optimization for the generic cpumap is left as an exercise for
a future patch.

Since cpumap entry progs are now supported, also remove the check in
generic_xdp_install() for the cpumap.

Signed-off-by: Kumar Kartikeya Dwivedi &lt;memxor@gmail.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Reviewed-by: Toke Høiland-Jørgensen &lt;toke@redhat.com&gt;
Acked-by: Jesper Dangaard Brouer &lt;brouer@redhat.com&gt;
Link: https://lore.kernel.org/bpf/20210702111825.491065-4-memxor@gmail.com
</content>
</entry>
<entry>
<title>page_pool: Allow drivers to hint on SKB recycling</title>
<updated>2021-06-07T21:11:47+00:00</updated>
<author>
<name>Ilias Apalodimas</name>
<email>ilias.apalodimas@linaro.org</email>
</author>
<published>2021-06-07T19:02:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/starfive-tech/linux.git/commit/?id=6a5bcd84e886a9a91982e515c539529c28acdcc2'/>
<id>urn:sha1:6a5bcd84e886a9a91982e515c539529c28acdcc2</id>
<content type='text'>
Up to now, several high-speed NICs have had custom mechanisms for
recycling the allocated memory they use for their payloads.
Our page_pool API already has recycling capabilities that are always
used when we are running in 'XDP mode'. So let's tweak the API and the
kernel network stack slightly and allow the recycling to happen even
during standard operation.
The API doesn't take into account the 'split page' policies currently
used by those drivers, but it can be extended once we have users for that.

The idea is to be able to intercept the packet on skb_release_data().
If it's a buffer coming from our page_pool API, recycle it back to the
pool for further usage; otherwise, just release the packet entirely.

To achieve that we introduce a bit in struct sk_buff (pp_recycle:1) and
a field in struct page (page-&gt;pp) to store the page_pool pointer.
Storing the information in page-&gt;pp allows us to recycle both SKBs and
their fragments.
We could have skipped the skb bit entirely, since identical information
can be derived from struct page. However, in an effort to affect the free
path as little as possible, reading a single bit in the skb, which is
already in cache, is better than deriving the same information from the
data stored in the page.

The driver or page_pool has to take care of the sync operations on its
own during buffer recycling, since the buffer is never unmapped after
opting in to recycling.

Since the gain for drivers depends on the architecture, we are not
enabling recycling by default merely because a driver uses the
page_pool API. To enable recycling, the driver must call
skb_mark_for_recycle() to store the information we need for recycling
in page-&gt;pp and set the recycling bit, or page_pool_store_mem_info()
for a fragment.
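
A hedged driver-side sketch of the opt-in described above (check the
tree for the exact prototype):

    /* rx path: build the skb, then opt it in to page_pool recycling */
    skb = build_skb(page_address(page), PAGE_SIZE);
    if (skb)
            skb_mark_for_recycle(skb, page, pool);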

Co-developed-by: Jesper Dangaard Brouer &lt;brouer@redhat.com&gt;
Signed-off-by: Jesper Dangaard Brouer &lt;brouer@redhat.com&gt;
Co-developed-by: Matteo Croce &lt;mcroce@microsoft.com&gt;
Signed-off-by: Matteo Croce &lt;mcroce@microsoft.com&gt;
Signed-off-by: Ilias Apalodimas &lt;ilias.apalodimas@linaro.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>skbuff: add a parameter to __skb_frag_unref</title>
<updated>2021-06-07T21:11:47+00:00</updated>
<author>
<name>Matteo Croce</name>
<email>mcroce@microsoft.com</email>
</author>
<published>2021-06-07T19:02:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/starfive-tech/linux.git/commit/?id=c420c98982fa9e749c99e022845d5f323d098b72'/>
<id>urn:sha1:c420c98982fa9e749c99e022845d5f323d098b72</id>
<content type='text'>
This is a prerequisite patch; the next one enables recycling of
skbs and fragments. Add an extra argument to __skb_frag_unref() to
handle recycling, and update the current users of the function
accordingly.
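
A sketch of the updated helper; the new argument is only plumbed
through here, recycling itself lands in the next patch:

    /**
     * __skb_frag_unref - release a reference on a paged fragment
     * @frag: the paged fragment
     * @recycle: if true, attempt page_pool recycling (next patch)
     */
    static inline void __skb_frag_unref(skb_frag_t *frag, bool recycle)
    {
            put_page(skb_frag_page(frag));
    }

Existing callers simply pass false for now.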

Signed-off-by: Matteo Croce &lt;mcroce@microsoft.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: Introduce skb_send_sock() for sock_map</title>
<updated>2021-04-01T17:56:13+00:00</updated>
<author>
<name>Cong Wang</name>
<email>cong.wang@bytedance.com</email>
</author>
<published>2021-03-31T02:32:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/starfive-tech/linux.git/commit/?id=0739cd28f2645e814586c7536ba5da9825cb8029'/>
<id>urn:sha1:0739cd28f2645e814586c7536ba5da9825cb8029</id>
<content type='text'>
We only have skb_send_sock_locked(), which requires callers
to use lock_sock(). Introduce a variant, skb_send_sock(),
which takes the socket lock on its own, so callers no longer
need to lock it. This saves us from adding a -&gt;sendmsg_locked
for each protocol.

To reuse the code, pass function pointers to __skb_send_sock()
and build skb_send_sock() and skb_send_sock_locked() on top.
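
A sketch of the shared-core construction (helper names and exact
signatures are illustrative):

    typedef int (*sendmsg_func)(struct sock *sk, struct msghdr *msg,
                                struct kvec *vec, size_t num, size_t size);

    /* unlocked variant: the plain kernel_sendmsg() path */
    static int sendmsg_unlocked(struct sock *sk, struct msghdr *msg,
                                struct kvec *vec, size_t num, size_t size)
    {
            return kernel_sendmsg(sk-&gt;sk_socket, msg, vec, num, size);
    }

    /* both entry points share __skb_send_sock() */
    int skb_send_sock_locked(struct sock *sk, struct sk_buff *skb,
                             int offset, int len)
    {
            return __skb_send_sock(sk, skb, offset, len, sendmsg_locked);
    }

    int skb_send_sock(struct sock *sk, struct sk_buff *skb, int offset,
                      int len)
    {
            return __skb_send_sock(sk, skb, offset, len, sendmsg_unlocked);
    }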

Signed-off-by: Cong Wang &lt;cong.wang@bytedance.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Reviewed-by: Jakub Sitnicki &lt;jakub@cloudflare.com&gt;
Link: https://lore.kernel.org/bpf/20210331023237.41094-4-xiyou.wangcong@gmail.com
</content>
</entry>
</feed>
