<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/net/bridge/br_device.c, branch linux-4.13.y</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=linux-4.13.y</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=linux-4.13.y'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2017-09-01T17:11:28+00:00</updated>
<entry>
<title>bridge: switchdev: Clear forward mark when transmitting packet</title>
<updated>2017-09-01T17:11:28+00:00</updated>
<author>
<name>Ido Schimmel</name>
<email>idosch@mellanox.com</email>
</author>
<published>2017-09-01T09:22:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=79e99bdd60b484af9afe0147e85a13e66d5c1cdb'/>
<id>urn:sha1:79e99bdd60b484af9afe0147e85a13e66d5c1cdb</id>
<content type='text'>
Commit 6bc506b4fb06 ("bridge: switchdev: Add forward mark support for
stacked devices") added the 'offload_fwd_mark' bit to the skb in order
to allow drivers to indicate to the bridge driver that they already
forwarded the packet in L2.

In case the bit is set, before transmitting the packet from each port,
the port's mark is compared with the mark stored in the skb's control
block. If both marks are equal, we know the packet arrived from a switch
device that already forwarded the packet and it's not re-transmitted.

However, if the packet is transmitted from the bridge device itself
(e.g., br0), we should clear the 'offload_fwd_mark' bit as the mark
stored in the skb's control block isn't valid.

This scenario can happen in rare cases where a packet was trapped during
L3 forwarding and forwarded by the kernel to a bridge device.
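
As a rough illustration only (this is not kernel code), the check
described above can be modeled in Python. The Skb class, allowed_egress()
and bridge_dev_xmit() are hypothetical names invented for this sketch,
and the mark values 1 and 2 are arbitrary:

```python
class Skb:
    """Hypothetical stand-in for an skb: a flag bit plus the mark in its control block."""

    def __init__(self, offload_fwd_mark=False, cb_mark=0):
        self.offload_fwd_mark = offload_fwd_mark
        self.cb_mark = cb_mark


def allowed_egress(port_mark, skb):
    # Skip the port only if the bit is set and both marks are equal.
    return not (skb.offload_fwd_mark and skb.cb_mark == port_mark)


def bridge_dev_xmit(skb, port_marks):
    # The packet is transmitted from the bridge device itself (e.g., br0):
    # the control-block mark is not valid here, so clear the bit first.
    skb.offload_fwd_mark = False
    return [m for m in port_marks if allowed_egress(m, skb)]


# An skb that still carries a stale bit and a mark equal to port mark 1.
stale = Skb(offload_fwd_mark=True, cb_mark=1)
skipped_without_fix = [m for m in (1, 2) if not allowed_egress(m, stale)]
sent_with_fix = bridge_dev_xmit(stale, (1, 2))
```

Without clearing the bit, the port whose mark equals the stale value is
wrongly skipped. With the bit cleared, every port transmits the packet.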

Fixes: 6bc506b4fb06 ("bridge: switchdev: Add forward mark support for stacked devices")
Signed-off-by: Ido Schimmel &lt;idosch@mellanox.com&gt;
Reported-by: Yotam Gigi &lt;yotamg@mellanox.com&gt;
Tested-by: Yotam Gigi &lt;yotamg@mellanox.com&gt;
Reviewed-by: Jiri Pirko &lt;jiri@mellanox.com&gt;
Acked-by: Nikolay Aleksandrov &lt;nikolay@cumulusnetworks.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: bridge: fix dest lookup when vlan proto doesn't match</title>
<updated>2017-07-14T15:19:23+00:00</updated>
<author>
<name>Nikolay Aleksandrov</name>
<email>nikolay@cumulusnetworks.com</email>
</author>
<published>2017-07-13T13:09:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=31a4562d7408493c6377933ff2f7d7302dbdea80'/>
<id>urn:sha1:31a4562d7408493c6377933ff2f7d7302dbdea80</id>
<content type='text'>
With 802.1ad support the vlan_ingress code started checking for a vlan
protocol mismatch, which causes the current tag to be inserted and the
bridge vlan protocol &amp; pvid to be set. The vlan tag insertion changes
the skb mac_header, so the lookup mac dest pointer, which was loaded
prior to calling br_allowed_ingress in br_handle_frame_finish, is now
VLAN_HLEN bytes off. It points to the last two bytes of the destination
mac and the first four of the source mac, so lookups always fail and all
such packets are broadcast to all ports. The same thing happens for
locally originated packets when passing via br_dev_xmit. So load the dest
pointer after the vlan checks and possible skb change.
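
The offset arithmetic can be sketched in Python with a plain byte
buffer. This is a simplified model, not the kernel's skb handling: the
MAC addresses, the tag bytes and the buffer layout with VLAN_HLEN bytes
of headroom are assumptions made for illustration.

```python
VLAN_HLEN = 4  # size of one vlan tag in bytes
ETH_ALEN = 6   # size of a MAC address in bytes

dest = bytes.fromhex("aabbccddeeff")     # made-up destination mac
src = bytes.fromhex("112233445566")      # made-up source mac
payload = bytes.fromhex("0800") + b"data"

# Untagged frame preceded by VLAN_HLEN bytes of headroom.
buf = bytearray(VLAN_HLEN) + dest + src + payload
mac_header = VLAN_HLEN
stale_dest = mac_header  # offset saved before the vlan checks

# Tag insertion: both MAC addresses move into the headroom and the tag
# takes the space they leave behind, so mac_header moves back by VLAN_HLEN.
tag = bytes.fromhex("88a80001")          # made-up tag bytes
buf[0:2 * ETH_ALEN] = dest + src
buf[2 * ETH_ALEN:2 * ETH_ALEN + VLAN_HLEN] = tag
mac_header = 0

# What the old offset reads now, and what a freshly loaded offset reads.
stale = bytes(buf[stale_dest:stale_dest + ETH_ALEN])
fresh = bytes(buf[mac_header:mac_header + ETH_ALEN])
```

Here stale holds the last two bytes of dest followed by the first four
bytes of src, while fresh equals dest, which is why the pointer has to
be loaded after the vlan checks.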

Fixes: 8580e2117c06 ("bridge: Prepare for 802.1ad vlan filtering support")
Reported-by: Anitha Narasimha Murthy &lt;anitha@cumulusnetworks.com&gt;
Signed-off-by: Nikolay Aleksandrov &lt;nikolay@cumulusnetworks.com&gt;
Acked-by: Toshiaki Makita &lt;makita.toshiaki@lab.ntt.co.jp&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: Fix inconsistent teardown and release of private netdev state.</title>
<updated>2017-06-07T19:53:24+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2017-05-08T16:52:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=cf124db566e6b036b8bcbe8decbed740bdfac8c6'/>
<id>urn:sha1:cf124db566e6b036b8bcbe8decbed740bdfac8c6</id>
<content type='text'>
Network devices can allocate resources and private memory using
netdev_ops-&gt;ndo_init().  However, the release of these resources
can occur in one of two different places.

Either netdev_ops-&gt;ndo_uninit() or netdev-&gt;destructor().

The decision of which operation frees the resources depends upon
whether it is necessary for all netdev refs to be released before it
is safe to perform the freeing.

netdev_ops-&gt;ndo_uninit() presumably can occur right after the
NETDEV_UNREGISTER notifier completes and the unicast and multicast
address lists are flushed.

netdev-&gt;destructor(), on the other hand, does not run until the
netdev references all go away.

Further complicating the situation is that netdev-&gt;destructor()
almost universally also does a free_netdev().

This creates a problem for the logic in register_netdevice(),
because all callers of register_netdevice() manage the freeing
of the netdev, and invoke free_netdev(dev) if register_netdevice()
fails.

If netdev_ops-&gt;ndo_init() succeeds, but something else fails inside
of register_netdevice(), it does call netdev_ops-&gt;ndo_uninit().  But
it is not able to invoke netdev-&gt;destructor().

This is because netdev-&gt;destructor() will do a free_netdev() and
then the caller of register_netdevice() will do the same.

However, this means that the resources that would normally be released
by netdev-&gt;destructor() will not be.

Over the years drivers have added local hacks to deal with this, by
invoking their destructor parts by hand when register_netdevice()
fails.

Many drivers do not try to deal with this, and instead we have leaks.

Let's close this hole by formalizing the distinction between what
private things need to be freed up by netdev-&gt;destructor() and whether
the driver needs unregister_netdevice() to perform the free_netdev().

netdev-&gt;priv_destructor() performs all actions to free up the private
resources that used to be freed by netdev-&gt;destructor(), except for
free_netdev().

netdev-&gt;needs_free_netdev is a boolean that indicates whether
free_netdev() should be done at the end of unregister_netdevice().

Now, register_netdevice() can sanely release all resources after
netdev_ops-&gt;ndo_init() succeeds, by invoking both
netdev_ops-&gt;ndo_uninit() and netdev-&gt;priv_destructor().

And at the end of unregister_netdevice(), we invoke
netdev-&gt;priv_destructor() and optionally call free_netdev().
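
The resulting call order can be modeled with a small Python sketch. It
is a toy model and not kernel code: the Dev class, the events list and
the fail_after_init parameter are hypothetical, and the waiting for
netdev references to go away is not modeled.

```python
events = []


class Dev:
    """Hypothetical model of a net_device; not kernel code."""

    needs_free_netdev = True

    def ndo_init(self):
        events.append("ndo_init")

    def ndo_uninit(self):
        events.append("ndo_uninit")

    def priv_destructor(self):
        events.append("priv_destructor")


def free_netdev(dev):
    events.append("free_netdev")


def register_netdevice(dev, fail_after_init=False):
    dev.ndo_init()
    if fail_after_init:
        # Release everything ndo_init() set up, but leave free_netdev()
        # to the caller, which always frees the netdev on failure.
        dev.ndo_uninit()
        dev.priv_destructor()
        return False
    return True


def unregister_netdevice(dev):
    dev.ndo_uninit()
    dev.priv_destructor()
    if dev.needs_free_netdev:
        free_netdev(dev)


dev = Dev()
if not register_netdevice(dev, fail_after_init=True):
    free_netdev(dev)  # the caller frees the netdev exactly once
```

In this model a failed registration still runs both release hooks, and
free_netdev() is reached once through the caller rather than through
the destructor.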

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bridge: move bridge multicast cleanup to ndo_uninit</title>
<updated>2017-04-25T18:02:39+00:00</updated>
<author>
<name>Xin Long</name>
<email>lucien.xin@gmail.com</email>
</author>
<published>2017-04-25T14:58:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b1b9d366028ff580e6dd80b48a69c473361456f1'/>
<id>urn:sha1:b1b9d366028ff580e6dd80b48a69c473361456f1</id>
<content type='text'>
While a bridge device is being removed, if the bridge is still up, a new
mdb entry can still be added in br_multicast_add_group() after all mdb
entries have been removed in br_multicast_dev_del(), for example via the
path:

  mld_ifc_timer_expire -&gt;
    mld_sendpack -&gt; ...
      br_multicast_rcv -&gt;
        br_multicast_add_group

The new mp's timer will be set up. If the timer expires after the bridge
is freed, it may cause a use-after-free panic in
br_multicast_group_expired.

BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
IP: [&lt;ffffffffa07ed2c8&gt;] br_multicast_group_expired+0x28/0xb0 [bridge]
Call Trace:
 &lt;IRQ&gt;
 [&lt;ffffffff81094536&gt;] call_timer_fn+0x36/0x110
 [&lt;ffffffffa07ed2a0&gt;] ? br_mdb_free+0x30/0x30 [bridge]
 [&lt;ffffffff81096967&gt;] run_timer_softirq+0x237/0x340
 [&lt;ffffffff8108dcbf&gt;] __do_softirq+0xef/0x280
 [&lt;ffffffff8169889c&gt;] call_softirq+0x1c/0x30
 [&lt;ffffffff8102c275&gt;] do_softirq+0x65/0xa0
 [&lt;ffffffff8108e055&gt;] irq_exit+0x115/0x120
 [&lt;ffffffff81699515&gt;] smp_apic_timer_interrupt+0x45/0x60
 [&lt;ffffffff81697a5d&gt;] apic_timer_interrupt+0x6d/0x80

Nikolay also found it would cause a memory leak - the mdb hash is
reallocated and not freed due to the mdb rehash.

unreferenced object 0xffff8800540ba800 (size 2048):
  backtrace:
    [&lt;ffffffff816e2287&gt;] kmemleak_alloc+0x67/0xc0
    [&lt;ffffffff81260bea&gt;] __kmalloc+0x1ba/0x3e0
    [&lt;ffffffffa05c60ee&gt;] br_mdb_rehash+0x5e/0x340 [bridge]
    [&lt;ffffffffa05c74af&gt;] br_multicast_new_group+0x43f/0x6e0 [bridge]
    [&lt;ffffffffa05c7aa3&gt;] br_multicast_add_group+0x203/0x260 [bridge]
    [&lt;ffffffffa05ca4b5&gt;] br_multicast_rcv+0x945/0x11d0 [bridge]
    [&lt;ffffffffa05b6b10&gt;] br_dev_xmit+0x180/0x470 [bridge]
    [&lt;ffffffff815c781b&gt;] dev_hard_start_xmit+0xbb/0x3d0
    [&lt;ffffffff815c8743&gt;] __dev_queue_xmit+0xb13/0xc10
    [&lt;ffffffff815c8850&gt;] dev_queue_xmit+0x10/0x20
    [&lt;ffffffffa02f8d7a&gt;] ip6_finish_output2+0x5ca/0xac0 [ipv6]
    [&lt;ffffffffa02fbfc6&gt;] ip6_finish_output+0x126/0x2c0 [ipv6]
    [&lt;ffffffffa02fc245&gt;] ip6_output+0xe5/0x390 [ipv6]
    [&lt;ffffffffa032b92c&gt;] NF_HOOK.constprop.44+0x6c/0x240 [ipv6]
    [&lt;ffffffffa032bd16&gt;] mld_sendpack+0x216/0x3e0 [ipv6]
    [&lt;ffffffffa032d5eb&gt;] mld_ifc_timer_expire+0x18b/0x2b0 [ipv6]

This can happen when a bridge is removed with ip link or when a netns
with a bridge device inside is destroyed.

Following Nikolay's suggestion, this patch cleans up bridge multicast in
ndo_uninit after the bridge dev is shut down, instead of in br_dev_delete,
so that the netif_running check in br_multicast_add_group can avoid this
issue.

v1-&gt;v2:
  - fix this issue by moving br_multicast_dev_del to ndo_uninit, instead
    of calling dev_close in br_dev_delete.

(NOTE: Depends upon b6fe0440c637 ("bridge: implement missing ndo_uninit()"))

Fixes: e10177abf842 ("bridge: multicast: fix handling of temp and perm entries")
Reported-by: Jianwen Ji &lt;jiji@redhat.com&gt;
Signed-off-by: Xin Long &lt;lucien.xin@gmail.com&gt;
Reviewed-by: Stephen Hemminger &lt;stephen@networkplumber.org&gt;
Signed-off-by: Nikolay Aleksandrov &lt;nikolay@cumulusnetworks.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bridge: implement missing ndo_uninit()</title>
<updated>2017-04-12T02:22:44+00:00</updated>
<author>
<name>Ido Schimmel</name>
<email>idosch@mellanox.com</email>
</author>
<published>2017-04-10T11:59:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b6fe0440c63716e09cfc0d1484e3898a0f29d1d1'/>
<id>urn:sha1:b6fe0440c63716e09cfc0d1484e3898a0f29d1d1</id>
<content type='text'>
While the bridge driver implements an ndo_init(), it was missing a
symmetric ndo_uninit(), causing the different de-initialization
operations to be scattered around its dellink() and destructor().

Implement a symmetric ndo_uninit() and remove the overlapping operations
from its dellink() and destructor().

This is a prerequisite for the next patch, as it allows us to have a
proper cleanup upon changelink() failure during the bridge's newlink().

Fixes: b6677449dff6 ("bridge: netlink: call br_changelink() during br_dev_newlink()")
Signed-off-by: Nikolay Aleksandrov &lt;nikolay@cumulusnetworks.com&gt;
Signed-off-by: Ido Schimmel &lt;idosch@mellanox.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bridge: fdb: converge fdb searching functions into one</title>
<updated>2017-02-14T17:41:02+00:00</updated>
<author>
<name>Nikolay Aleksandrov</name>
<email>nikolay@cumulusnetworks.com</email>
</author>
<published>2017-02-13T13:59:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=bfd0aeac52f74bfb44c0974131e44abb33a13e78'/>
<id>urn:sha1:bfd0aeac52f74bfb44c0974131e44abb33a13e78</id>
<content type='text'>
Before this patch we had 3 different fdb searching functions, which was
confusing. This patch reduces all of them to one, fdb_find_rcu(), with
two flavors: br_fdb_find(), which requires hash_lock, and
br_fdb_find_rcu(), which requires RCU. This makes it clear what needs to
be used. We also remove two abusers of __br_fdb_get which called it under
hash_lock and replace them with br_fdb_find().

Signed-off-by: Nikolay Aleksandrov &lt;nikolay@cumulusnetworks.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bridge: move to workqueue gc</title>
<updated>2017-02-07T03:53:13+00:00</updated>
<author>
<name>Nikolay Aleksandrov</name>
<email>nikolay@cumulusnetworks.com</email>
</author>
<published>2017-02-04T17:05:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f7cdee8a79a1cb03fa9ca71b825e72f880b344e1'/>
<id>urn:sha1:f7cdee8a79a1cb03fa9ca71b825e72f880b344e1</id>
<content type='text'>
Move the fdb garbage collector to a workqueue which fires at least 10
milliseconds apart and cleans chain by chain, allowing other tasks
to run in the meantime. With thousands of fdbs the system is much
more responsive. Most importantly, remove the need to check whether the
matched entry has expired in __br_fdb_get, which causes false sharing and
is completely unnecessary if we clean up entries. At worst we'll get 10ms
of traffic for that entry before it gets deleted.

Signed-off-by: Nikolay Aleksandrov &lt;nikolay@cumulusnetworks.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: remove ndo_neigh_{construct, destroy} from stacked devices</title>
<updated>2017-02-06T16:25:57+00:00</updated>
<author>
<name>Ido Schimmel</name>
<email>idosch@mellanox.com</email>
</author>
<published>2017-02-06T15:20:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a8eca326151ee1beac82a4fd86d9edad3a37aaed'/>
<id>urn:sha1:a8eca326151ee1beac82a4fd86d9edad3a37aaed</id>
<content type='text'>
In commit 18bfb924f000 ("net: introduce default neigh_construct/destroy
ndo calls for L2 upper devices") we added these ndos to stacked devices
such as team and bond, so that calls will be propagated to mlxsw.

However, the previous commit removed the reliance on these ndos and no
new users of these ndos have appeared since the above-mentioned commit.
We can therefore safely remove this dead code.

Signed-off-by: Ido Schimmel &lt;idosch@mellanox.com&gt;
Signed-off-by: Jiri Pirko &lt;jiri@mellanox.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: make ndo_get_stats64 a void function</title>
<updated>2017-01-08T22:51:44+00:00</updated>
<author>
<name>stephen hemminger</name>
<email>stephen@networkplumber.org</email>
</author>
<published>2017-01-07T03:12:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=bc1f44709cf27fb2a5766cadafe7e2ad5e9cb221'/>
<id>urn:sha1:bc1f44709cf27fb2a5766cadafe7e2ad5e9cb221</id>
<content type='text'>
The network device operation for reading statistics is only called
in one place, and it ignores the return value. Having a structure
return value is potentially confusing because some future driver could
incorrectly assume that the return value was used.

Fix all drivers with ndo_get_stats64 to have a void function.

Signed-off-by: Stephen Hemminger &lt;sthemmin@microsoft.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>Replace &lt;asm/uaccess.h&gt; with &lt;linux/uaccess.h&gt; globally</title>
<updated>2016-12-24T19:46:01+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2016-12-24T19:46:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7c0f6ba682b9c7632072ffbedf8d328c8f3c42ba'/>
<id>urn:sha1:7c0f6ba682b9c7632072ffbedf8d328c8f3c42ba</id>
<content type='text'>
This was entirely automated, using the script by Al:

  PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*&lt;asm/uaccess.h&gt;'
  sed -i -e "s!$PATT!#include &lt;linux/uaccess.h&gt;!" \
        $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)

to do the replacement at the end of the merge window.

Requested-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
