<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/include/net/sch_generic.h, branch v4.4.269</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v4.4.269</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v4.4.269'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2020-05-10T08:25:57+00:00</updated>
<entry>
<title>net_sched: keep backlog updated with qlen</title>
<updated>2020-05-10T08:25:57+00:00</updated>
<author>
<name>WANG Cong</name>
<email>xiyou.wangcong@gmail.com</email>
</author>
<published>2016-06-03T22:05:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=76684c7647f5f25c89f1dcc86f4f04a515e46580'/>
<id>urn:sha1:76684c7647f5f25c89f1dcc86f4f04a515e46580</id>
<content type='text'>
commit a27758ffaf96f89002129eedb2cc172d254099f8 upstream.

For gso_skb we only update qlen, backlog should be updated too.

Note, it is correct to just update these stats at one layer,
because the gso_skb is cached there.

Reported-by: Stas Nichiporovich &lt;stasn77@gmail.com&gt;
Fixes: 2ccccf5fb43f ("net_sched: update hierarchical backlog too")
Cc: Jamal Hadi Salim &lt;jhs@mojatatu.com&gt;
Signed-off-by: Cong Wang &lt;xiyou.wangcong@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>sch_netem: fix rcu splat in netem_enqueue()</title>
<updated>2019-11-06T11:09:24+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2019-09-24T20:11:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=09ee754c623344d6a94caf115e77ed74b4b2a702'/>
<id>urn:sha1:09ee754c623344d6a94caf115e77ed74b4b2a702</id>
<content type='text'>
commit 159d2c7d8106177bd9a986fd005a311fe0d11285 upstream.

qdisc_root() use from netem_enqueue() triggers a lockdep warning.

__dev_queue_xmit() uses rcu_read_lock_bh() which is
not equivalent to rcu_read_lock() + local_bh_disable_bh as far
as lockdep is concerned.

WARNING: suspicious RCU usage
5.3.0-rc7+ #0 Not tainted
-----------------------------
include/net/sch_generic.h:492 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1
3 locks held by syz-executor427/8855:
 #0: 00000000b5525c01 (rcu_read_lock_bh){....}, at: lwtunnel_xmit_redirect include/net/lwtunnel.h:92 [inline]
 #0: 00000000b5525c01 (rcu_read_lock_bh){....}, at: ip_finish_output2+0x2dc/0x2570 net/ipv4/ip_output.c:214
 #1: 00000000b5525c01 (rcu_read_lock_bh){....}, at: __dev_queue_xmit+0x20a/0x3650 net/core/dev.c:3804
 #2: 00000000364bae92 (&amp;(&amp;sch-&gt;q.lock)-&gt;rlock){+.-.}, at: spin_lock include/linux/spinlock.h:338 [inline]
 #2: 00000000364bae92 (&amp;(&amp;sch-&gt;q.lock)-&gt;rlock){+.-.}, at: __dev_xmit_skb net/core/dev.c:3502 [inline]
 #2: 00000000364bae92 (&amp;(&amp;sch-&gt;q.lock)-&gt;rlock){+.-.}, at: __dev_queue_xmit+0x14b8/0x3650 net/core/dev.c:3838

stack backtrace:
CPU: 0 PID: 8855 Comm: syz-executor427 Not tainted 5.3.0-rc7+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x172/0x1f0 lib/dump_stack.c:113
 lockdep_rcu_suspicious+0x153/0x15d kernel/locking/lockdep.c:5357
 qdisc_root include/net/sch_generic.h:492 [inline]
 netem_enqueue+0x1cfb/0x2d80 net/sched/sch_netem.c:479
 __dev_xmit_skb net/core/dev.c:3527 [inline]
 __dev_queue_xmit+0x15d2/0x3650 net/core/dev.c:3838
 dev_queue_xmit+0x18/0x20 net/core/dev.c:3902
 neigh_hh_output include/net/neighbour.h:500 [inline]
 neigh_output include/net/neighbour.h:509 [inline]
 ip_finish_output2+0x1726/0x2570 net/ipv4/ip_output.c:228
 __ip_finish_output net/ipv4/ip_output.c:308 [inline]
 __ip_finish_output+0x5fc/0xb90 net/ipv4/ip_output.c:290
 ip_finish_output+0x38/0x1f0 net/ipv4/ip_output.c:318
 NF_HOOK_COND include/linux/netfilter.h:294 [inline]
 ip_mc_output+0x292/0xf40 net/ipv4/ip_output.c:417
 dst_output include/net/dst.h:436 [inline]
 ip_local_out+0xbb/0x190 net/ipv4/ip_output.c:125
 ip_send_skb+0x42/0xf0 net/ipv4/ip_output.c:1555
 udp_send_skb.isra.0+0x6b2/0x1160 net/ipv4/udp.c:887
 udp_sendmsg+0x1e96/0x2820 net/ipv4/udp.c:1174
 inet_sendmsg+0x9e/0xe0 net/ipv4/af_inet.c:807
 sock_sendmsg_nosec net/socket.c:637 [inline]
 sock_sendmsg+0xd7/0x130 net/socket.c:657
 ___sys_sendmsg+0x3e2/0x920 net/socket.c:2311
 __sys_sendmmsg+0x1bf/0x4d0 net/socket.c:2413
 __do_sys_sendmmsg net/socket.c:2442 [inline]
 __se_sys_sendmmsg net/socket.c:2439 [inline]
 __x64_sys_sendmmsg+0x9d/0x100 net/socket.c:2439
 do_syscall_64+0xfd/0x6a0 arch/x86/entry/common.c:296
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reported-by: syzbot &lt;syzkaller@googlegroups.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>net_sched: fix order of queue length updates in qdisc_replace()</title>
<updated>2017-08-30T08:19:21+00:00</updated>
<author>
<name>Konstantin Khlebnikov</name>
<email>khlebnikov@yandex-team.ru</email>
</author>
<published>2017-08-19T12:37:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=58079f56b30227c37709e9db7e9b51204e81f4ea'/>
<id>urn:sha1:58079f56b30227c37709e9db7e9b51204e81f4ea</id>
<content type='text'>
[ Upstream commit 68a66d149a8c78ec6720f268597302883e48e9fa ]

This important to call qdisc_tree_reduce_backlog() after changing queue
length. Parent qdisc should deactivate class in -&gt;qlen_notify() called from
qdisc_tree_reduce_backlog() but this happens only if qdisc-&gt;q.qlen in zero.

Missed class deactivations leads to crashes/warnings at picking packets
from empty qdisc and corrupting state at reactivating this class in future.

Signed-off-by: Konstantin Khlebnikov &lt;khlebnikov@yandex-team.ru&gt;
Fixes: 86a7996cc8a0 ("net_sched: introduce qdisc_replace() helper")
Acked-by: Cong Wang &lt;xiyou.wangcong@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>net/sched: act_vlan: Push skb-&gt;data to mac_header prior calling skb_vlan_*() functions</title>
<updated>2016-11-15T06:46:37+00:00</updated>
<author>
<name>Shmulik Ladkani</name>
<email>shmulik.ladkani@gmail.com</email>
</author>
<published>2016-09-29T09:10:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=9edbf4a0b60b62a1fb5f57248f6c9b9ffb30c328'/>
<id>urn:sha1:9edbf4a0b60b62a1fb5f57248f6c9b9ffb30c328</id>
<content type='text'>
[ Upstream commit f39acc84aad10710e89835c60d3b6694c43a8dd9 ]

Generic skb_vlan_push/skb_vlan_pop functions don't properly handle the
case where the input skb data pointer does not point at the mac header:

- They're doing push/pop, but fail to properly unwind data back to its
  original location.
  For example, in the skb_vlan_push case, any subsequent
  'skb_push(skb, skb-&gt;mac_len)' calls make the skb-&gt;data point 4 bytes
  BEFORE start of frame, leading to bogus frames that may be transmitted.

- They update rcsum per the added/removed 4 bytes tag.
  Alas if data is originally after the vlan/eth headers, then these
  bytes were already pulled out of the csum.

OTOH calling skb_vlan_push/skb_vlan_pop with skb-&gt;data at mac_header
present no issues.

act_vlan is the only caller to skb_vlan_*() that has skb-&gt;data pointing
at network header (upon ingress).
Other calles (ovs, bpf) already adjust skb-&gt;data at mac_header.

This patch fixes act_vlan to point to the mac_header prior calling
skb_vlan_*() functions, as other callers do.

Signed-off-by: Shmulik Ladkani &lt;shmulik.ladkani@gmail.com&gt;
Cc: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Cc: Pravin Shelar &lt;pshelar@ovn.org&gt;
Cc: Jiri Pirko &lt;jiri@mellanox.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>net_sched: update hierarchical backlog too</title>
<updated>2016-05-19T00:06:39+00:00</updated>
<author>
<name>WANG Cong</name>
<email>xiyou.wangcong@gmail.com</email>
</author>
<published>2016-02-25T22:55:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ca375cf34a7186f5b8817082d2b594dcd8cedc8b'/>
<id>urn:sha1:ca375cf34a7186f5b8817082d2b594dcd8cedc8b</id>
<content type='text'>
[ Upstream commit 2ccccf5fb43ff62b2b96cc58d95fc0b3596516e4 ]

When the bottom qdisc decides to, for example, drop some packet,
it calls qdisc_tree_decrease_qlen() to update the queue length
for all its ancestors, we need to update the backlog too to
keep the stats on root qdisc accurate.

Cc: Jamal Hadi Salim &lt;jhs@mojatatu.com&gt;
Acked-by: Jamal Hadi Salim &lt;jhs@mojatatu.com&gt;
Signed-off-by: Cong Wang &lt;xiyou.wangcong@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>net_sched: introduce qdisc_replace() helper</title>
<updated>2016-05-19T00:06:39+00:00</updated>
<author>
<name>WANG Cong</name>
<email>xiyou.wangcong@gmail.com</email>
</author>
<published>2016-02-25T22:55:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1188e1403a772d9698c06bcc53c44783536ff09e'/>
<id>urn:sha1:1188e1403a772d9698c06bcc53c44783536ff09e</id>
<content type='text'>
[ Upstream commit 86a7996cc8a078793670d82ed97d5a99bb4e8496 ]

Remove nearly duplicated code and prepare for the following patch.

Cc: Jamal Hadi Salim &lt;jhs@mojatatu.com&gt;
Acked-by: Jamal Hadi Salim &lt;jhs@mojatatu.com&gt;
Signed-off-by: Cong Wang &lt;xiyou.wangcong@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>net_sched: fix qdisc_tree_decrease_qlen() races</title>
<updated>2015-12-03T19:59:05+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2015-12-02T04:08:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4eaf3b84f2881c9c028f1d5e76c52ab575fe3a66'/>
<id>urn:sha1:4eaf3b84f2881c9c028f1d5e76c52ab575fe3a66</id>
<content type='text'>
qdisc_tree_decrease_qlen() suffers from two problems on multiqueue
devices.

One problem is that it updates sch-&gt;q.qlen and sch-&gt;qstats.drops
on the mq/mqprio root qdisc, while it should not : Daniele
reported underflows errors :
[  681.774821] PAX: sch-&gt;q.qlen: 0 n: 1
[  681.774825] PAX: size overflow detected in function qdisc_tree_decrease_qlen net/sched/sch_api.c:769 cicus.693_49 min, count: 72, decl: qlen; num: 0; context: sk_buff_head;
[  681.774954] CPU: 2 PID: 19 Comm: ksoftirqd/2 Tainted: G           O    4.2.6.201511282239-1-grsec #1
[  681.774955] Hardware name: ASUSTeK COMPUTER INC. X302LJ/X302LJ, BIOS X302LJ.202 03/05/2015
[  681.774956]  ffffffffa9a04863 0000000000000000 0000000000000000 ffffffffa990ff7c
[  681.774959]  ffffc90000d3bc38 ffffffffa95d2810 0000000000000007 ffffffffa991002b
[  681.774960]  ffffc90000d3bc68 ffffffffa91a44f4 0000000000000001 0000000000000001
[  681.774962] Call Trace:
[  681.774967]  [&lt;ffffffffa95d2810&gt;] dump_stack+0x4c/0x7f
[  681.774970]  [&lt;ffffffffa91a44f4&gt;] report_size_overflow+0x34/0x50
[  681.774972]  [&lt;ffffffffa94d17e2&gt;] qdisc_tree_decrease_qlen+0x152/0x160
[  681.774976]  [&lt;ffffffffc02694b1&gt;] fq_codel_dequeue+0x7b1/0x820 [sch_fq_codel]
[  681.774978]  [&lt;ffffffffc02680a0&gt;] ? qdisc_peek_dequeued+0xa0/0xa0 [sch_fq_codel]
[  681.774980]  [&lt;ffffffffa94cd92d&gt;] __qdisc_run+0x4d/0x1d0
[  681.774983]  [&lt;ffffffffa949b2b2&gt;] net_tx_action+0xc2/0x160
[  681.774985]  [&lt;ffffffffa90664c1&gt;] __do_softirq+0xf1/0x200
[  681.774987]  [&lt;ffffffffa90665ee&gt;] run_ksoftirqd+0x1e/0x30
[  681.774989]  [&lt;ffffffffa90896b0&gt;] smpboot_thread_fn+0x150/0x260
[  681.774991]  [&lt;ffffffffa9089560&gt;] ? sort_range+0x40/0x40
[  681.774992]  [&lt;ffffffffa9085fe4&gt;] kthread+0xe4/0x100
[  681.774994]  [&lt;ffffffffa9085f00&gt;] ? kthread_worker_fn+0x170/0x170
[  681.774995]  [&lt;ffffffffa95d8d1e&gt;] ret_from_fork+0x3e/0x70

mq/mqprio have their own ways to report qlen/drops by folding stats on
all their queues, with appropriate locking.

A second problem is that qdisc_tree_decrease_qlen() calls qdisc_lookup()
without proper locking : concurrent qdisc updates could corrupt the list
that qdisc_match_from_root() parses to find a qdisc given its handle.

Fix first problem adding a TCQ_F_NOPARENT qdisc flag that
qdisc_tree_decrease_qlen() can use to abort its tree traversal,
as soon as it meets a mq/mqprio qdisc children.

Second problem can be fixed by RCU protection.
Qdisc are already freed after RCU grace period, so qdisc_list_add() and
qdisc_list_del() simply have to use appropriate rcu list variants.

A future patch will add a per struct netdev_queue list anchor, so that
qdisc_tree_decrease_qlen() can have more efficient lookups.

Reported-by: Daniele Fucini &lt;dfucini@gmail.com&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Cong Wang &lt;cwang@twopensource.com&gt;
Cc: Jamal Hadi Salim &lt;jhs@mojatatu.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bpf: add bpf_redirect() helper</title>
<updated>2015-09-18T04:09:07+00:00</updated>
<author>
<name>Alexei Starovoitov</name>
<email>ast@plumgrid.com</email>
</author>
<published>2015-09-16T06:05:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=27b29f63058d26c6c1742f1993338280d5a41dc6'/>
<id>urn:sha1:27b29f63058d26c6c1742f1993338280d5a41dc6</id>
<content type='text'>
Existing bpf_clone_redirect() helper clones skb before redirecting
it to RX or TX of destination netdev.
Introduce bpf_redirect() helper that does that without cloning.

Benchmarked with two hosts using 10G ixgbe NICs.
One host is doing line rate pktgen.
Another host is configured as:
$ tc qdisc add dev $dev ingress
$ tc filter add dev $dev root pref 10 u32 match u32 0 0 flowid 1:2 \
   action bpf run object-file tcbpf1_kern.o section clone_redirect_xmit drop
so it receives the packet on $dev and immediately xmits it on $dev + 1
The section 'clone_redirect_xmit' in tcbpf1_kern.o file has the program
that does bpf_clone_redirect() and performance is 2.0 Mpps

$ tc filter add dev $dev root pref 10 u32 match u32 0 0 flowid 1:2 \
   action bpf run object-file tcbpf1_kern.o section redirect_xmit drop
which is using bpf_redirect() - 2.4 Mpps

and using cls_bpf with integrated actions as:
$ tc filter add dev $dev root pref 10 \
  bpf run object-file tcbpf1_kern.o section redirect_xmit integ_act classid 1
performance is 2.5 Mpps

To summarize:
u32+act_bpf using clone_redirect - 2.0 Mpps
u32+act_bpf using redirect - 2.4 Mpps
cls_bpf using redirect - 2.5 Mpps

For comparison linux bridge in this setup is doing 2.1 Mpps
and ixgbe rx + drop in ip_rcv - 7.8 Mpps

Signed-off-by: Alexei Starovoitov &lt;ast@plumgrid.com&gt;
Acked-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Acked-by: John Fastabend &lt;john.r.fastabend@intel.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>cls_bpf: introduce integrated actions</title>
<updated>2015-09-18T04:09:06+00:00</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2015-09-16T06:05:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=045efa82ff563cd4e656ca1c2e354fa5bf6bbda4'/>
<id>urn:sha1:045efa82ff563cd4e656ca1c2e354fa5bf6bbda4</id>
<content type='text'>
Often cls_bpf classifier is used with single action drop attached.
Optimize this use case and let cls_bpf return both classid and action.
For backwards compatibility reasons enable this feature under
TCA_BPF_FLAG_ACT_DIRECT flag.

Then more interesting programs like the following are easier to write:
int cls_bpf_prog(struct __sk_buff *skb)
{
  /* classify arp, ip, ipv6 into different traffic classes
   * and drop all other packets
   */
  switch (skb-&gt;protocol) {
  case htons(ETH_P_ARP):
    skb-&gt;tc_classid = 1;
    break;
  case htons(ETH_P_IP):
    skb-&gt;tc_classid = 2;
    break;
  case htons(ETH_P_IPV6):
    skb-&gt;tc_classid = 3;
    break;
  default:
    return TC_ACT_SHOT;
  }

  return TC_ACT_OK;
}

Joint work with Daniel Borkmann.

Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@plumgrid.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: sched: register noqueue qdisc</title>
<updated>2015-08-28T00:14:30+00:00</updated>
<author>
<name>Phil Sutter</name>
<email>phil@nwl.cc</email>
</author>
<published>2015-08-27T19:21:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d66d6c3152e8d5a6db42a56bf7ae1c6cae87ba48'/>
<id>urn:sha1:d66d6c3152e8d5a6db42a56bf7ae1c6cae87ba48</id>
<content type='text'>
This way users can attach noqueue just like any other qdisc using tc
without having to mess with tx_queue_len first.

Signed-off-by: Phil Sutter &lt;phil@nwl.cc&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
