summaryrefslogtreecommitdiff
path: root/net/bridge/br_input.c
AgeCommit message (Collapse)AuthorFilesLines
2022-05-19net: bridge: Clear offload_fwd_mark when passing frame up bridge interface.Andrew Lunn1-0/+7
It is possible to stack bridges on top of each other. Consider the following which makes use of an Ethernet switch: br1 / \ / \ / \ br0.11 wlan0 | br0 / | \ p1 p2 p3 br0 is offloaded to the switch. Above br0 is a vlan interface, for vlan 11. This vlan interface is then a slave of br1. br1 also has a wireless interface as a slave. This setup trunks wireless lan traffic over the copper network inside a VLAN. A frame received on p1 which is passed up to the bridge has the skb->offload_fwd_mark flag set to true, indicating that the switch has dealt with forwarding the frame out ports p2 and p3 as needed. This flag instructs the software bridge it does not need to pass the frame back down again. However, the flag is not getting reset when the frame is passed upwards. As a result br1 sees the flag, wrongly interprets it, and fails to forward the frame to wlan0. When passing a frame upwards, clear the flag. This is the Rx equivalent of br_switchdev_frame_unmark() in br_dev_xmit(). Fixes: f1c2eddf4cb6 ("bridge: switchdev: Use an helper to clear forward mark") Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20220518005840.771575-1-andrew@lunn.ch Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-03-18net: bridge: mst: Multiple Spanning Tree (MST) modeTobias Waldekranz1-2/+15
Allow the user to switch from the current per-VLAN STP mode to an MST mode. Up to this point, per-VLAN STP states where always isolated from each other. This is in contrast to the MSTP standard (802.1Q-2018, Clause 13.5), where VLANs are grouped into MST instances (MSTIs), and the state is managed on a per-MSTI level, rather that at the per-VLAN level. Perhaps due to the prevalence of the standard, many switching ASICs are built after the same model. Therefore, add a corresponding MST mode to the bridge, which we can later add offloading support for in a straight-forward way. For now, all VLANs are fixed to MSTI 0, also called the Common Spanning Tree (CST). That is, all VLANs will follow the port-global state. Upcoming changes will make this actually useful by allowing VLANs to be mapped to arbitrary MSTIs and allow individual MSTI states to be changed. Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-23net: bridge: Add support for bridge port in locked modeHans Schultz1-1/+10
In a 802.1X scenario, clients connected to a bridge port shall not be allowed to have traffic forwarded until fully authenticated. A static fdb entry of the clients MAC address for the bridge port unlocks the client and allows bidirectional communication. This scenario is facilitated with setting the bridge port in locked mode, which is also supported by various switchcore chipsets. Signed-off-by: Hans Schultz <schultz.hans+netdev@gmail.com> Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25net: bridge: change return type of br_handle_ingress_vlan_tunnelKangmin Park1-5/+2
br_handle_ingress_vlan_tunnel() is only referenced in br_handle_frame(). If br_handle_ingress_vlan_tunnel() is called and return non-zero value, goto drop in br_handle_frame(). But, br_handle_ingress_vlan_tunnel() always return 0. So, the routines that check the return value and goto drop has no meaning. Therefore, change return type of br_handle_ingress_vlan_tunnel() to void and remove if statement of br_handle_frame(). Signed-off-by: Kangmin Park <l4stpr0gr4m@gmail.com> Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Link: https://lore.kernel.org/r/20210823102118.17966-1-l4stpr0gr4m@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-07-20net: bridge: add vlan mcast snooping knobNikolay Aleksandrov1-2/+3
Add a global knob that controls if vlan multicast snooping is enabled. The proper contexts (vlan or bridge-wide) will be chosen based on the knob when processing packets and changing bridge device state. Note that vlans have their individual mcast snooping enabled by default, but this knob is needed to turn on bridge vlan snooping. It is disabled by default. To enable the knob vlan filtering must also be enabled, it doesn't make sense to have vlan mcast snooping without vlan filtering since that would lead to inconsistencies. Disabling vlan filtering will also automatically disable vlan mcast snooping. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-20net: bridge: multicast: use multicast contexts instead of bridge or portNikolay Aleksandrov1-5/+9
Pass multicast context pointers to multicast functions instead of bridge/port. This would make it easier later to switch these contexts to their per-vlan versions. The patch is basically search and replace, no functional changes. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-14net: bridge: mcast: prepare is-router function for mcast router splitLinus Lüssing1-1/+1
In preparation for the upcoming split of multicast router state into their IPv4 and IPv6 variants make br_multicast_is_router() protocol family aware. Note that for now br_ip6_multicast_is_router() uses the currently still common ip4_mc_router_timer for now. It will be renamed to ip6_mc_router_timer later when the split is performed. While at it also renames the "1" and "2" constants in br_multicast_is_router() to the MDB_RTR_TYPE_TEMP_QUERY and MDB_RTR_TYPE_PERM enums. Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-10net: bridge: Fix fall-through warnings for ClangGustavo A. R. Silva1-0/+1
In preparation to enable -Wimplicit-fallthrough for Clang, fix a warning by explicitly adding a break statement instead of letting the code fall through to the next case. Link: https://github.com/KSPP/linux/issues/115 Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-01-10net: bridge: fix misspellings using codespell toolMenglong Dong1-1/+1
Some typos are found out by codespell tool: $ codespell ./net/bridge/ ./net/bridge/br_stp.c:604: permanant ==> permanent ./net/bridge/br_stp.c:605: persistance ==> persistence ./net/bridge/br.c:125: underlaying ==> underlying ./net/bridge/br_input.c:43: modue ==> mode ./net/bridge/br_mrp.c:828: Determin ==> Determine ./net/bridge/br_mrp.c:848: Determin ==> Determine ./net/bridge/br_mrp.c:897: Determin ==> Determine Fix typos found by codespell. Signed-off-by: Menglong Dong <dong.menglong@zte.com.cn> Acked-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20210108025332.52480-1-dong.menglong@zte.com.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-22net: bridge: switch to net core statistics counters handlingHeiner Kallweit1-5/+1
Use netdev->tstats instead of a member of net_bridge for storing a pointer to the per-cpu counters. This allows us to use core functionality for statistics handling. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://lore.kernel.org/r/9bad2be2-fd84-7c6e-912f-cee433787018@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-31net: bridge: mcast: add support for raw L2 multicast groupsNikolay Aleksandrov1-1/+1
Extend the bridge multicast control and data path to configure routes for L2 (non-IP) multicast groups. The uapi struct br_mdb_entry union u is extended with another variant, mac_addr, which does not change the structure size, and which is valid when the proto field is zero. To be compatible with the forwarding code that is already in place, which acts as an IGMP/MLD snooping bridge with querier capabilities, we need to declare that for L2 MDB entries (for which there exists no such thing as IGMP/MLD snooping/querying), that there is always a querier. Otherwise, these entries would be flooded to all bridge ports and not just to those that are members of the L2 multicast group. Needless to say, only permanent L2 multicast groups can be installed on a bridge port. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20201028233831.610076-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-30net: bridge: extend the process of special framesHenrik Bjoernlund1-1/+32
This patch extends the processing of frames in the bridge. Currently MRP frames needs special processing and the current implementation doesn't allow a nice way to process different frame types. Therefore try to improve this by adding a list that contains frame types that need special processing. This list is iterated for each input frame and if there is a match based on frame type then these functions will be called and decide what to do with the frame. It can process the frame then the bridge doesn't need to do anything or don't process so then the bridge will do normal forwarding. Signed-off-by: Henrik Bjoernlund <henrik.bjoernlund@microchip.com> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-11net: bridge: allow enslaving some DSA master network devicesVladimir Oltean1-1/+22
Commit 8db0a2ee2c63 ("net: bridge: reject DSA-enabled master netdevices as bridge members") added a special check in br_if.c in order to check for a DSA master network device with a tagging protocol configured. This was done because back then, such devices, once enslaved in a bridge would become inoperative and would not pass DSA tagged traffic anymore due to br_handle_frame returning RX_HANDLER_CONSUMED. But right now we have valid use cases which do require bridging of DSA masters. One such example is when the DSA master ports are DSA switch ports themselves (in a disjoint tree setup). This should be completely equivalent, functionally speaking, from having multiple DSA switches hanging off of the ports of a switchdev driver. So we should allow the enslaving of DSA tagged master network devices. Instead of the regular br_handle_frame(), install a new function br_handle_frame_dummy() on these DSA masters, which returns RX_HANDLER_PASS in order to call into the DSA specific tagging protocol handlers, and lift the restriction from br_add_if. Suggested-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Suggested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-04-27bridge: mrp: Integrate MRP into the bridgeHoratiu Vultur1-0/+3
To integrate MRP into the bridge, the bridge needs to do the following: - detect if the MRP frame was received on MRP ring port in that case it would be processed otherwise just forward it as usual. - enable parsing of MRP - before whenever the bridge was set up, it would set all the ports in forwarding state. Add an extra check to not set ports in forwarding state if the port is an MRP ring port. The reason of this change is that if the MRP instance initially sets the port in blocked state by setting the bridge up it would overwrite this setting. Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-24net: bridge: vlan: add per-vlan stateNikolay Aleksandrov1-2/+5
The first per-vlan option added is state, it is needed for EVPN and for per-vlan STP. The state allows to control the forwarding on per-vlan basis. The vlan state is considered only if the port state is forwarding in order to avoid conflicts and be consistent. br_allowed_egress is called only when the state is forwarding, but the ingress case is a bit more complicated due to the fact that we may have the transition between port:BR_STATE_FORWARDING -> vlan:BR_STATE_LEARNING which should still allow the bridge to learn from the packet after vlan filtering and it will be dropped after that. Also to optimize the pvid state check we keep a copy in the vlan group to avoid one lookup. The state members are modified with *_ONCE() to annotate the lockless access. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-04net: bridge: fdb: eliminate extra port state tests from fast-pathNikolay Aleksandrov1-0/+1
When commit df1c0b8468b3 ("[BRIDGE]: Packets leaking out of disabled/blocked ports.") introduced the port state tests in br_fdb_update() it was to avoid learning/refreshing from STP BPDUs, it was also used to avoid learning/refreshing from user-space with NTF_USE. Those two tests are done for every packet entering the bridge if it's learning, but for the fast-path we already have them checked in br_handle_frame() and is unnecessary to do it again. Thus push the checks to the unlikely cases and drop them from br_fdb_update(), the new nbp_state_should_learn() helper is used to determine if the port state allows br_fdb_update() to be called. The two places which need to do it manually are: - user-space add call with NTF_USE set - link-local packet learning done in __br_handle_local_finish() Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-01net: bridge: fdb: br_fdb_update can take flags directlyNikolay Aleksandrov1-2/+2
If we modify br_fdb_update() to take flags directly we can get rid of one test and one atomic bitop in the learning path. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-30net: bridge: fdb: convert is_local to bitopsNikolay Aleksandrov1-1/+1
The patch adds a new fdb flags field in the hole between the two cache lines and uses it to convert is_local to bitops. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-5/+3
Two cases of overlapping changes, nothing fancy. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-04netfilter: nf_queue: remove unused hook entries pointerFlorian Westphal1-1/+1
Its not used anywhere, so remove this. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-07-02net: bridge: don't cache ether dest pointer on inputNikolay Aleksandrov1-5/+3
We would cache ether dst pointer on input in br_handle_frame_finish but after the neigh suppress code that could lead to a stale pointer since both ipv4 and ipv6 suppress code do pskb_may_pull. This means we have to always reload it after the suppress code so there's no point in having it cached just retrieve it directly. Fixes: 057658cb33fbf ("bridge: suppress arp pkts on BR_NEIGH_SUPPRESS ports") Fixes: ed842faeb2bd ("bridge: suppress nd pkts on BR_NEIGH_SUPPRESS ports") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-30treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152Thomas Gleixner1-5/+1
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 3029 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-9/+14
Conflict resolution of af_smc.c from Stephen Rothwell. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-17net: bridge: fix per-port af_packet socketsNikolay Aleksandrov1-9/+14
When the commit below was introduced it changed two visible things: - the skb was no longer passed through the protocol handlers with the original device - the skb was passed up the stack with skb->dev = bridge The first change broke af_packet sockets on bridge ports. For example we use them for hostapd which listens for ETH_P_PAE packets on the ports. We discussed two possible fixes: - create a clone and pass it through NF_HOOK(), act on the original skb based on the result - somehow signal to the caller from the okfn() that it was called, meaning the skb is ok to be passed, which this patch is trying to implement via returning 1 from the bridge link-local okfn() Note that we rely on the fact that NF_QUEUE/STOLEN would return 0 and drop/error would return < 0 thus the okfn() is called only when the return was 1, so we signal to the caller that it was called by preserving the return value from nf_hook(). Fixes: 8626c56c8279 ("bridge: fix potential use-after-free when hook returns QUEUE or STOLEN verdict") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-15bridge: only include nf_queue.h if neededStephen Rothwell1-0/+2
After merging the netfilter-next tree, today's linux-next build (powerpc ppc44x_defconfig) failed like this: In file included from net/bridge/br_input.c:19: include/net/netfilter/nf_queue.h:16:23: error: field 'state' has incomplete type struct nf_hook_state state; ^~~~~ Fixes: 971502d77faa ("bridge: netfilter: unroll NF_HOOK helper in bridge input path") Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-04-12bridge: broute: make broute a real ebtables tableFlorian Westphal1-14/+4
This makes broute a normal ebtables table, hooking at PREROUTING. The broute hook is removed. It uses skb->cb to signal to bridge rx handler that the skb should be routed instead of being bridged. This change is backwards compatible with ebtables as no userspace visible parts are changed. This means we can also remove the !ops test in ebt_register_table, it was only there for broute table sake. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-04-12bridge: netfilter: unroll NF_HOOK helper in bridge input pathFlorian Westphal1-4/+51
Replace NF_HOOK() based invocation of the netfilter hooks with a private copy of nf_hook_slow(). This copy has one difference: it can return the rx handler value expected by the stack, i.e. RX_HANDLER_CONSUMED or RX_HANDLER_PASS. This is needed by the next patch to invoke the ebtables "broute" table via the standard netfilter hooks rather than the custom "br_should_route_hook" indirection that is used now. When the skb is to be "brouted", we must return RX_HANDLER_PASS from the bridge rx input handler, but there is no way to indicate this via NF_HOOK(), unless perhaps by some hack such as exposing bridge_cb in the netfilter core or a percpu flag. text data bss dec filename 3369 56 0 3425 net/bridge/br_input.o.before 3458 40 0 3498 net/bridge/br_input.o.after This allows removal of the "br_should_route_hook" in the next patch. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-04-12bridge: reduce size of input cb to 16 bytesFlorian Westphal1-0/+2
Reduce size of br_input_skb_cb from 24 to 16 bytes by using bitfield for those values that can only be 0 or 1. igmp is the igmp type value, so it needs to be at least u8. Furthermore, the bridge currently relies on step-by-step initialization of br_input_skb_cb fields as the skb passes through the stack. Explicitly zero out the bridge input cb instead, this avoids having to review/validate that no BR_INPUT_SKB_CB(skb)->foo test can see a 'random' value from previous protocol cb. AFAICS all current fields are always set up before they are read again, so this is not a bug fix. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-11-28net: bridge: add no_linklocal_learn bool optionNikolay Aleksandrov1-1/+3
Use the new boolopt API to add an option which disables learning from link-local packets. The default is kept as before and learning is enabled. This is a simple map from a boolopt bit to a bridge private flag that is tested before learning. v2: pass NULL for extack via sysfs Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-26net: bridge: convert neigh_suppress_enabled option to a bitNikolay Aleksandrov1-1/+1
Convert the neigh_suppress_enabled option to a bit. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-25net: bridge: add support for port isolationNikolay Aleksandrov1-0/+1
This patch adds support for a new port flag - BR_ISOLATED. If it is set then isolated ports cannot communicate between each other, but they can still communicate with non-isolated ports. The same can be achieved via ACLs but they can't scale with large number of ports and also the complexity of the rules grows. This feature can be used to achieve isolated vlan functionality (similar to pvlan) as well, though currently it will be port-wide (for all vlans on the port). The new test in should_deliver uses data that is already cache hot and the new boolean is used to avoid an additional source port test in should_deliver. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-10net: bridge: Rename mglist to host_joinedAndrew Lunn1-1/+1
The boolean mglist indicates the host has joined a particular multicast group on the bridge interface. It is badly named, obscuring what is means. Rename it. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09bridge: suppress nd pkts on BR_NEIGH_SUPPRESS portsRoopa Prabhu1-0/+11
This patch avoids flooding and proxies ndisc packets for BR_NEIGH_SUPPRESS ports. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09bridge: suppress arp pkts on BR_NEIGH_SUPPRESS portsRoopa Prabhu1-58/+5
This patch avoids flooding and proxies arp packets for BR_NEIGH_SUPPRESS ports. Moves existing br_do_proxy_arp to br_do_proxy_suppress_arp to support both proxy arp and neigh suppress. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-29net: bridge: add per-port group_fwd_mask with less restrictionsNikolay Aleksandrov1-0/+1
We need to be able to transparently forward most link-local frames via tunnels (e.g. vxlan, qinq). Currently the bridge's group_fwd_mask has a mask which restricts the forwarding of STP and LACP, but we need to be able to forward these over tunnels and control that forwarding on a per-port basis thus add a new per-port group_fwd_mask option which only disallows mac pause frames to be forwarded (they're always dropped anyway). The patch does not change the current default situation - all of the others are still restricted unless configured for forwarding. We have successfully tested this patch with LACP and STP forwarding over VxLAN and qinq tunnels. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-14net: bridge: fix dest lookup when vlan proto doesn't matchNikolay Aleksandrov1-1/+2
With 802.1ad support the vlan_ingress code started checking for vlan protocol mismatch which causes the current tag to be inserted and the bridge vlan protocol & pvid to be set. The vlan tag insertion changes the skb mac_header and thus the lookup mac dest pointer which was loaded prior to calling br_allowed_ingress in br_handle_frame_finish is VLAN_HLEN bytes off now, pointing to the last two bytes of the destination mac and the first four of the source mac causing lookups to always fail and broadcasting all such packets to all ports. Same thing happens for locally originated packets when passing via br_dev_xmit. So load the dest pointer after the vlan checks and possible skb change. Fixes: 8580e2117c06 ("bridge: Prepare for 802.1ad vlan filtering support") Reported-by: Anitha Narasimha Murthy <anitha@cumulusnetworks.com> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Acked-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-13bridge: drop netfilter fake rtable unconditionallyFlorian Westphal1-0/+1
Andreas reports kernel oops during rmmod of the br_netfilter module. Hannes debugged the oops down to a NULL rt6info->rt6i_indev. Problem is that br_netfilter has the nasty concept of adding a fake rtable to skb->dst; this happens in a br_netfilter prerouting hook. A second hook (in bridge LOCAL_IN) is supposed to remove these again before the skb is handed up the stack. However, on module unload hooks get unregistered which means an skb could traverse the prerouting hook that attaches the fake_rtable, while the 'fake rtable remove' hook gets removed from the hooklist immediately after. Fixes: 34666d467cbf1e2e3c7 ("netfilter: bridge: move br_netfilter out of the core") Reported-by: Andreas Karis <akaris@redhat.com> Debugged-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-14bridge: fdb: converge fdb searching functions into oneNikolay Aleksandrov1-2/+2
Before this patch we had 3 different fdb searching functions which was confusing. This patch reduces all of them to one - fdb_find_rcu(), and two flavors: br_fdb_find() which requires hash_lock and br_fdb_find_rcu which requires RCU. This makes it clear what needs to be used, we also remove two abusers of __br_fdb_get which called it under hash_lock and replace them with br_fdb_find(). Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-07bridge: avoid unnecessary read of jiffiesstephen hemminger1-2/+4
Jiffies is volatile so read it once. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-07bridge: fdb: write to used and updated at most once per jiffyNikolay Aleksandrov1-1/+2
Writing once per jiffy is enough to limit the bridge's false sharing. After this change the bridge doesn't show up in the local load HitM stats. Suggested-by: David S. Miller <davem@davemloft.net> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-03bridge: vlan dst_metadata hooks in ingress and egress pathsRoopa Prabhu1-1/+7
- ingress hook: - if port is a tunnel port, use tunnel info in attached dst_metadata to map it to a local vlan - egress hook: - if port is a tunnel port, use tunnel info attached to vlan to set dst_metadata on the skb CC: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-5/+2
Conflicts: drivers/net/ethernet/mediatek/mtk_eth_soc.c drivers/net/ethernet/qlogic/qed/qed_dcbx.c drivers/net/phy/Kconfig All conflicts were cases of overlapping commits. Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-02net: bridge: change unicast boolean to exact pkt_typeNikolay Aleksandrov1-15/+25
Remove the unicast flag and introduce an exact pkt_type. That would help us for the upcoming per-port multicast flood flag and also slightly reduce the tests in the input fast path. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-02net: bridge: don't increment tx_dropped in br_do_proxy_arpNikolay Aleksandrov1-5/+2
pskb_may_pull may fail due to various reasons (e.g. alloc failure), but the skb isn't changed/dropped and processing continues so we shouldn't increment tx_dropped. CC: Kyeyoon Park <kyeyoonp@codeaurora.org> CC: Roopa Prabhu <roopa@cumulusnetworks.com> CC: Stephen Hemminger <stephen@networkplumber.org> CC: bridge@lists.linux-foundation.org Fixes: 958501163ddd ("bridge: Add support for IEEE 802.11 Proxy ARP") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-26bridge: switchdev: Add forward mark support for stacked devicesIdo Schimmel1-0/+2
switchdev_port_fwd_mark_set() is used to set the 'offload_fwd_mark' of port netdevs so that packets being flooded by the device won't be flooded twice. It works by assigning a unique identifier (the ifindex of the first bridge port) to bridge ports sharing the same parent ID. This prevents packets from being flooded twice by the same switch, but will flood packets through bridge ports belonging to a different switch. This method is problematic when stacked devices are taken into account, such as VLANs. In such cases, a physical port netdev can have upper devices being members in two different bridges, thus requiring two different 'offload_fwd_mark's to be configured on the port netdev, which is impossible. The main problem is that packet and netdev marking is performed at the physical netdev level, whereas flooding occurs between bridge ports, which are not necessarily port netdevs. Instead, packet and netdev marking should really be done in the bridge driver with the switch driver only telling it which packets it already forwarded. The bridge driver will mark such packets using the mark assigned to the ingress bridge port and will prevent the packet from being forwarded through any bridge port sharing the same mark (i.e. having the same parent ID). Remove the current switchdev 'offload_fwd_mark' implementation and instead implement the proposed method. In addition, make rocker - the sole user of the mark - use the proposed method. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-25bridge: Fix incorrect re-injection of LLDP packetsIdo Schimmel1-0/+8
Commit 8626c56c8279 ("bridge: fix potential use-after-free when hook returns QUEUE or STOLEN verdict") caused LLDP packets arriving through a bridge port to be re-injected to the Rx path with skb->dev set to the bridge device, but this breaks the lldpad daemon. The lldpad daemon opens a packet socket with protocol set to ETH_P_LLDP for any valid device on the system, which doesn't not include soft devices such as bridge and VLAN. Since packet sockets (ptype_base) are processed in the Rx path after the Rx handler, LLDP packets with skb->dev set to the bridge device never reach the lldpad daemon. Fix this by making the bridge's Rx handler re-inject LLDP packets with RX_HANDLER_PASS, which effectively restores the behaviour prior to the mentioned commit. This means netfilter will never receive LLDP packets coming through a bridge port, as I don't see a way in which we can have okfn() consume the packet without breaking existing behaviour. I've already carried out a similar fix for STP packets in commit 56fae404fb2c ("bridge: Fix incorrect re-injection of STP packets"). Fixes: 8626c56c8279 ("bridge: fix potential use-after-free when hook returns QUEUE or STOLEN verdict") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Cc: Florian Westphal <fw@strlen.de> Cc: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-17net: bridge: remove _deliver functions and consolidate forward codeNikolay Aleksandrov1-3/+3
Before this patch we had two flavors of most forwarding functions - _forward and _deliver, the difference being that the latter are used when the packets are locally originated. Instead of all this function pointer passing and code duplication, we can just pass a boolean noting that the packet was locally originated and use that to perform the necessary checks in __br_forward. This gives a minor performance improvement but more importantly consolidates the forwarding paths. Also add a kernel doc comment to explain the exported br_forward()'s arguments. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-17net: bridge: drop skb2/skb0 variables and use a local_rcv booleanNikolay Aleksandrov1-15/+10
Currently if the packet is going to be received locally we set skb0 or sometimes called skb2 variables to the original skb. This can get confusing and also we can avoid one conditional on the fast path by simply using a boolean and passing it around. Thanks to Roopa for the name suggestion. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-17net: bridge: rearrange flood vs unicast receive pathsNikolay Aleksandrov1-15/+14
This patch removes one conditional from the unicast path by using the fact that skb is NULL only when the packet is multicast or is local. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-17net: bridge: minor style adjustments in br_handle_frame_finishNikolay Aleksandrov1-10/+8
Trivial style changes in br_handle_frame_finish. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>