274 files changed, 10067 insertions, 3215 deletions
diff --git a/Documentation/bpf/bpf_devel_QA.txt b/Documentation/bpf/bpf_devel_QA.txt
new file mode 100644
index 000000000000..cefef855dea4
--- /dev/null
+++ b/Documentation/bpf/bpf_devel_QA.txt
@@ -0,0 +1,519 @@
+This document provides information for the BPF subsystem about various
+workflows related to reporting bugs, submitting patches, and queueing
+patches for stable kernels.
+
+For general information about submitting patches, please refer to
+Documentation/process/. This document only describes additional specifics
+related to BPF.
+
+Reporting bugs:
+---------------
+
+Q: How do I report bugs for BPF kernel code?
+
+A: Since all BPF kernel development as well as bpftool and iproute2 BPF
+   loader development happens through the netdev kernel mailing list,
+   please report any found issues around BPF to the following mailing
+   list:
+
+     netdev@vger.kernel.org
+
+   This may also include issues related to XDP, BPF tracing, etc.
+
+   Given netdev has a high volume of traffic, please also add the BPF
+   maintainers to Cc (from kernel MAINTAINERS file):
+
+     Alexei Starovoitov <ast@kernel.org>
+     Daniel Borkmann <daniel@iogearbox.net>
+
+   In case a buggy commit has already been identified, make sure to keep
+   the actual commit authors in Cc as well for the report. They can
+   typically be identified through the kernel's git tree.
+
+   Please do *not* report BPF issues to bugzilla.kernel.org since it
+   is a guarantee that the reported issue will be overlooked.
+
+Submitting patches:
+-------------------
+
+Q: To which mailing list do I need to submit my BPF patches?
+
+A: Please submit your BPF patches to the netdev kernel mailing list:
+
+     netdev@vger.kernel.org
+
+   Historically, BPF came out of networking and has always been maintained
+   by the kernel networking community. Although these days BPF touches
+   many other subsystems as well, the patches are still routed mainly
+   through the networking community.
+
+   In case your patch has changes in various different subsystems (e.g.
+   tracing, security, etc), make sure to Cc the related kernel mailing
+   lists and maintainers from there as well, so they are able to review
+   the changes and provide their Acked-by's to the patches.
+
+Q: Where can I find patches currently under discussion for BPF subsystem?
+
+A: All patches that are Cc'ed to netdev are queued for review under netdev
+   patchwork project:
+
+     http://patchwork.ozlabs.org/project/netdev/list/
+
+   Those patches which target BPF, are assigned to a 'bpf' delegate for
+   further processing from BPF maintainers. The current queue with
+   patches under review can be found at:
+
+     https://patchwork.ozlabs.org/project/netdev/list/?delegate=77147
+
+   Once the patches have been reviewed by the BPF community as a whole
+   and approved by the BPF maintainers, their status in patchwork will be
+   changed to 'Accepted' and the submitter will be notified by mail. This
+   means that the patches look good from a BPF perspective and have been
+   applied to one of the two BPF kernel trees.
+
+   In case feedback from the community requires a respin of the patches,
+   their status in patchwork will be set to 'Changes Requested', and purged
+   from the current review queue. Likewise for cases where patches would
+   get rejected or are not applicable to the BPF trees (but assigned to
+   the 'bpf' delegate).
+
+Q: How do the changes make their way into Linux?
+
+A: There are two BPF kernel trees (git repositories). Once patches have
+   been accepted by the BPF maintainers, they will be applied to one
+   of the two BPF trees:
+
+     https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git/
+     https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/
+
+   The bpf tree itself is for fixes only, whereas bpf-next for features,
+   cleanups or other kind of improvements ("next-like" content). This is
+   analogous to net and net-next trees for networking. Both bpf and
+   bpf-next will only have a master branch in order to simplify against
+   which branch patches should get rebased to.
+
+   Accumulated BPF patches in the bpf tree will regularly get pulled
+   into the net kernel tree. Likewise, accumulated BPF patches accepted
+   into the bpf-next tree will make their way into net-next tree. net and
+   net-next are both run by David S. Miller. From there, they will go
+   into the kernel mainline tree run by Linus Torvalds. To read up on the
+   process of net and net-next being merged into the mainline tree, see
+   the netdev FAQ under:
+
+     Documentation/networking/netdev-FAQ.txt
+
+   Occasionally, to prevent merge conflicts, we might send pull requests
+   to other trees (e.g. tracing) with a small subset of the patches, but
+   net and net-next are always the main trees targeted for integration.
+
+   The pull requests will contain a high-level summary of the accumulated
+   patches and can be searched on netdev kernel mailing list through the
+   following subject lines (yyyy-mm-dd is the date of the pull request):
+
+     pull-request: bpf yyyy-mm-dd
+     pull-request: bpf-next yyyy-mm-dd
+
+Q: How do I indicate which tree (bpf vs. bpf-next) my patch should be
+   applied to?
+
+A: The process is the very same as described in the netdev FAQ, so
+   please read up on it. The subject line must indicate whether the
+   patch is a fix or rather "next-like" content in order to let the
+   maintainers know whether it is targeted at bpf or bpf-next.
+
+   For fixes eventually landing in bpf -> net tree, the subject must
+   look like:
+
+     git format-patch --subject-prefix='PATCH bpf' start..finish
+
+   For features/improvements/etc that should eventually land in
+   bpf-next -> net-next, the subject must look like:
+
+     git format-patch --subject-prefix='PATCH bpf-next' start..finish
+
+   If unsure whether the patch or patch series should go into bpf
+   or net directly, or bpf-next or net-next directly, it is not a
+   problem either if the subject line says net or net-next as target.
+   It is eventually up to the maintainers to do the delegation of
+   the patches.
+
+   If it is clear that patches should go into bpf or bpf-next tree,
+   please make sure to rebase the patches against those trees in
+   order to reduce potential conflicts.
+
+   In case the patch or patch series has to be reworked and sent out
+   again in a second or later revision, it is also required to add a
+   version number (v2, v3, ...) into the subject prefix:
+
+     git format-patch --subject-prefix='PATCH net-next v2' start..finish
+
+   When changes have been requested to the patch series, always send the
+   whole patch series again with the feedback incorporated (never send
+   individual diffs on top of the old series).
+
+Q: What does it mean when a patch gets applied to bpf or bpf-next tree?
+
+A: It means that the patch looks good for mainline inclusion from
+   a BPF point of view.
+
+   Be aware that this is not a final verdict that the patch will
+   automatically get accepted into net or net-next trees eventually:
+
+   On the netdev kernel mailing list reviews can come in at any point
+   in time. If discussions around a patch conclude that they cannot
+   get included as-is, we will either apply a follow-up fix or drop
+   them from the trees entirely. Therefore, we also reserve to rebase
+   the trees when deemed necessary. After all, the purpose of the tree
+   is to i) accumulate and stage BPF patches for integration into trees
+   like net and net-next, and ii) run extensive BPF test suite and
+   workloads on the patches before they make their way any further.
+
+   Once the BPF pull request was accepted by David S. Miller, then
+   the patches end up in net or net-next tree, respectively, and
+   make their way from there further into mainline. Again, see the
+   netdev FAQ for additional information e.g. on how often they are
+   merged to mainline.
+
+Q: How long do I need to wait for feedback on my BPF patches?
+
+A: We try to keep the latency low. The usual time to feedback will
+   be around 2 or 3 business days. It may vary depending on the
+   complexity of changes and current patch load.
+
+Q: How often do you send pull requests to major kernel trees like
+   net or net-next?
+
+A: Pull requests will be sent out rather often in order to not
+   accumulate too many patches in bpf or bpf-next.
+
+   As a rule of thumb, expect pull requests for each tree regularly
+   at the end of the week. In some cases pull requests could additionally
+   come also in the middle of the week depending on the current patch
+   load or urgency.
+
+Q: Are patches applied to bpf-next when the merge window is open?
+
+A: For the time when the merge window is open, bpf-next will not be
+   processed. This is roughly analogous to net-next patch processing,
+   so feel free to read up on the netdev FAQ about further details.
+
+   During those two weeks of merge window, we might ask you to resend
+   your patch series once bpf-next is open again. Once Linus released
+   a v*-rc1 after the merge window, we continue processing of bpf-next.
+
+   For non-subscribers to kernel mailing lists, there is also a status
+   page run by David S. Miller on net-next that provides guidance:
+
+     http://vger.kernel.org/~davem/net-next.html
+
+Q: I made a BPF verifier change, do I need to add test cases for
+   BPF kernel selftests?
+
+A: If the patch has changes to the behavior of the verifier, then yes,
+   it is absolutely necessary to add test cases to the BPF kernel
+   selftests suite. If they are not present and we think they are
+   needed, then we might ask for them before accepting any changes.
+
+   In particular, test_verifier.c is tracking a high number of BPF test
+   cases, including a lot of corner cases that LLVM BPF back end may
+   generate out of the restricted C code. Thus, adding test cases is
+   absolutely crucial to make sure future changes do not accidentally
+   affect prior use-cases. Thus, treat those test cases as: verifier
+   behavior that is not tracked in test_verifier.c could potentially
+   be subject to change.
+
+Q: When should I add code to samples/bpf/ and when to BPF kernel
+   selftests?
+
+A: In general, we prefer additions to BPF kernel selftests rather than
+   samples/bpf/. The rationale is very simple: kernel selftests are
+   regularly run by various bots to test for kernel regressions.
+
+   The more test cases we add to BPF selftests, the better the coverage
+   and the less likely it is that those could accidentally break. It is
+   not that BPF kernel selftests cannot demo how a specific feature can
+   be used.
+
+   That said, samples/bpf/ may be a good place for people to get started,
+   so it might be advisable that simple demos of features could go into
+   samples/bpf/, but advanced functional and corner-case testing rather
+   into kernel selftests.
+
+   If your sample looks like a test case, then go for BPF kernel selftests
+   instead!
+
+Q: When should I add code to the bpftool?
+
+A: The main purpose of bpftool (under tools/bpf/bpftool/) is to provide
+   a central user space tool for debugging and introspection of BPF programs
+   and maps that are active in the kernel. If UAPI changes related to BPF
+   enable for dumping additional information of programs or maps, then
+   bpftool should be extended as well to support dumping them.
+
+Q: When should I add code to iproute2's BPF loader?
+
+A: For UAPI changes related to the XDP or tc layer (e.g. cls_bpf), the
+   convention is that those control-path related changes are added to
+   iproute2's BPF loader as well from user space side. This is not only
+   useful to have UAPI changes properly designed to be usable, but also
+   to make those changes available to a wider user base of major
+   downstream distributions.
+
+Q: Do you accept patches as well for iproute2's BPF loader?
+
+A: Patches for the iproute2's BPF loader have to be sent to:
+
+     netdev@vger.kernel.org
+
+   While those patches are not processed by the BPF kernel maintainers,
+   please keep them in Cc as well, so they can be reviewed.
+
+   The official git repository for iproute2 is run by Stephen Hemminger
+   and can be found at:
+
+     https://git.kernel.org/pub/scm/linux/kernel/git/shemminger/iproute2.git/
+
+   The patches need to have a subject prefix of '[PATCH iproute2 master]'
+   or '[PATCH iproute2 net-next]'. 'master' or 'net-next' describes the
+   target branch where the patch should be applied to. Meaning, if kernel
+   changes went into the net-next kernel tree, then the related iproute2
+   changes need to go into the iproute2 net-next branch, otherwise they
+   can be targeted at master branch. The iproute2 net-next branch will get
+   merged into the master branch after the current iproute2 version from
+   master has been released.
+
+   Like BPF, the patches end up in patchwork under the netdev project and
+   are delegated to 'shemminger' for further processing:
+
+     http://patchwork.ozlabs.org/project/netdev/list/?delegate=389
+
+Q: What is the minimum requirement before I submit my BPF patches?
+
+A: When submitting patches, always take the time and properly test your
+   patches *prior* to submission. Never rush them! If maintainers find
+   that your patches have not been properly tested, it is a good way to
+   get them grumpy. Testing patch submissions is a hard requirement!
+
+   Note, fixes that go to bpf tree *must* have a Fixes: tag included. The
+   same applies to fixes that target bpf-next, where the affected commit
+   is in net-next (or in some cases bpf-next). The Fixes: tag is crucial
+   in order to identify follow-up commits and tremendously helps for people
+   having to do backporting, so it is a must have!
+
+   We also don't accept patches with an empty commit message. Take your
+   time and properly write up a high quality commit message, it is
+   essential!
+
+   Think about it this way: other developers looking at your code a month
+   from now need to understand *why* a certain change has been done that
+   way, and whether there have been flaws in the analysis or assumptions
+   that the original author did. Thus providing a proper rationale and
+   describing the use-case for the changes is a must.
+
+   Patch submissions with >1 patch must have a cover letter which includes
+   a high level description of the series. This high level summary will
+   then be placed into the merge commit by the BPF maintainers such that
+   it is also accessible from the git log for future reference.
+
+Q: What do I need to consider when adding a new instruction or feature
+   that would require BPF JIT and/or LLVM integration as well?
+
+A: We try hard to keep all BPF JITs up to date such that the same user
+   experience can be guaranteed when running BPF programs on different
+   architectures without having the program punt to the less efficient
+   interpreter in case the in-kernel BPF JIT is enabled.
+
+   If you are unable to implement or test the required JIT changes for
+   certain architectures, please work together with the related BPF JIT
+   developers in order to get the feature implemented in a timely manner.
+   Please refer to the git log (arch/*/net/) to locate the necessary
+   people for helping out.
+
+   Also always make sure to add BPF test cases (e.g. test_bpf.c and
+   test_verifier.c) for new instructions, so that they can receive
+   broad test coverage and help run-time testing the various BPF JITs.
+
+   In case of new BPF instructions, once the changes have been accepted
+   into the Linux kernel, please implement support into LLVM's BPF back
+   end. See LLVM section below for further information.
+
+Stable submission:
+------------------
+
+Q: I need a specific BPF commit in stable kernels. What should I do?
+
+A: In case you need a specific fix in stable kernels, first check whether
+   the commit has already been applied in the related linux-*.y branches:
+
+     https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/
+
+   If not the case, then drop an email to the BPF maintainers with the
+   netdev kernel mailing list in Cc and ask for the fix to be queued up:
+
+     netdev@vger.kernel.org
+
+   The process in general is the same as on netdev itself, see also the
+   netdev FAQ document.
+
+Q: Do you also backport to kernels not currently maintained as stable?
+
+A: No. If you need a specific BPF commit in kernels that are currently not
+   maintained by the stable maintainers, then you are on your own.
+
+   The current stable and longterm stable kernels are all listed here:
+
+     https://www.kernel.org/
+
+Q: The BPF patch I am about to submit needs to go to stable as well. What
+   should I do?
+
+A: The same rules apply as with netdev patch submissions in general, see
+   netdev FAQ under:
+
+     Documentation/networking/netdev-FAQ.txt
+
+   Never add "Cc: stable@vger.kernel.org" to the patch description, but
+   ask the BPF maintainers to queue the patches instead. This can be done
+   with a note, for example, under the "---" part of the patch which does
+   not go into the git log. Alternatively, this can be done as a simple
+   request by mail instead.
+
+Q: Where do I find currently queued BPF patches that will be submitted
+   to stable?
+
+A: Once patches that fix critical bugs got applied into the bpf tree, they
+   are queued up for stable submission under:
+
+     http://patchwork.ozlabs.org/bundle/bpf/stable/?state=*
+
+   They will be on hold there at minimum until the related commit made its
+   way into the mainline kernel tree.
+
+   After having been under broader exposure, the queued patches will be
+   submitted by the BPF maintainers to the stable maintainers.
+
+Testing patches:
+----------------
+
+Q: Which BPF kernel selftests version should I run my kernel against?
+
+A: If you run a kernel xyz, then always run the BPF kernel selftests from
+   that kernel xyz as well. Do not expect that the BPF selftest from the
+   latest mainline tree will pass all the time.
+
+   In particular, test_bpf.c and test_verifier.c have a large number of
+   test cases and are constantly updated with new BPF test sequences, or
+   existing ones are adapted to verifier changes e.g. due to verifier
+   becoming smarter and being able to better track certain things.
+
+LLVM:
+-----
+
+Q: Where do I find LLVM with BPF support?
+
+A: The BPF back end for LLVM is upstream in LLVM since version 3.7.1.
+
+   All major distributions these days ship LLVM with BPF back end enabled,
+   so for the majority of use-cases it is not required to compile LLVM by
+   hand anymore, just install the distribution provided package.
+
+   LLVM's static compiler lists the supported targets through 'llc --version',
+   make sure BPF targets are listed. Example:
+
+     $ llc --version
+     LLVM (http://llvm.org/):
+       LLVM version 6.0.0svn
+       Optimized build.
+       Default target: x86_64-unknown-linux-gnu
+       Host CPU: skylake
+
+       Registered Targets:
+         bpf    - BPF (host endian)
+         bpfeb  - BPF (big endian)
+         bpfel  - BPF (little endian)
+         x86    - 32-bit X86: Pentium-Pro and above
+         x86-64 - 64-bit X86: EM64T and AMD64
+
+   For developers in order to utilize the latest features added to LLVM's
+   BPF back end, it is advisable to run the latest LLVM releases. Support
+   for new BPF kernel features such as additions to the BPF instruction
+   set are often developed together.
+
+   All LLVM releases can be found at: http://releases.llvm.org/
+
+Q: Got it, so how do I build LLVM manually anyway?
+
+A: You need cmake and gcc-c++ as build requisites for LLVM. Once you have
+   that set up, proceed with building the latest LLVM and clang version
+   from the git repositories:
+
+     $ git clone http://llvm.org/git/llvm.git
+     $ cd llvm/tools
+     $ git clone --depth 1 http://llvm.org/git/clang.git
+     $ cd ..; mkdir build; cd build
+     $ cmake .. -DLLVM_TARGETS_TO_BUILD="BPF;X86" \
+                -DBUILD_SHARED_LIBS=OFF           \
+                -DCMAKE_BUILD_TYPE=Release        \
+                -DLLVM_BUILD_RUNTIME=OFF
+     $ make -j $(getconf _NPROCESSORS_ONLN)
+
+   The built binaries can then be found in the build/bin/ directory, where
+   you can point the PATH variable to.
+
+Q: Should I notify BPF kernel maintainers about issues in LLVM's BPF code
+   generation back end or about LLVM generated code that the verifier
+   refuses to accept?
+
+A: Yes, please do! LLVM's BPF back end is a key piece of the whole BPF
+   infrastructure and it ties deeply into verification of programs from the
+   kernel side. Therefore, any issues on either side need to be investigated
+   and fixed whenever necessary.
+
+   Therefore, please make sure to bring them up at netdev kernel mailing
+   list and Cc BPF maintainers for LLVM and kernel bits:
+
+     Yonghong Song <yhs@fb.com>
+     Alexei Starovoitov <ast@kernel.org>
+     Daniel Borkmann <daniel@iogearbox.net>
+
+   LLVM also has an issue tracker where BPF related bugs can be found:
+
+     https://bugs.llvm.org/buglist.cgi?quicksearch=bpf
+
+   However, it is better to reach out through mailing lists with having
+   maintainers in Cc.
+
+Q: I have added a new BPF instruction to the kernel, how can I integrate
+   it into LLVM?
+
+A: LLVM has a -mcpu selector for the BPF back end in order to allow the
+   selection of BPF instruction set extensions. By default the 'generic'
+   processor target is used, which is the base instruction set (v1) of BPF.
+
+   LLVM has an option to select -mcpu=probe where it will probe the host
+   kernel for supported BPF instruction set extensions and selects the
+   optimal set automatically.
+
+   For cross-compilation, a specific version can be select manually as well.
+
+     $ llc -march bpf -mcpu=help
+     Available CPUs for this target:
+
+       generic - Select the generic processor.
+       probe   - Select the probe processor.
+       v1      - Select the v1 processor.
+       v2      - Select the v2 processor.
+     [...]
+
+   Newly added BPF instructions to the Linux kernel need to follow the same
+   scheme, bump the instruction set version and implement probing for the
+   extensions such that -mcpu=probe users can benefit from the optimization
+   transparently when upgrading their kernels.
+
+   If you are unable to implement support for the newly added BPF instruction
+   please reach out to BPF developers for help.
+
+   By the way, the BPF kernel selftests run with -mcpu=probe for better
+   test coverage.
+
+Happy BPF hacking!
diff --git a/Documentation/devicetree/bindings/net/can/fsl-flexcan.txt b/Documentation/devicetree/bindings/net/can/fsl-flexcan.txt
index 56d6cc336e1c..bfc0c433654f 100644
--- a/Documentation/devicetree/bindings/net/can/fsl-flexcan.txt
+++ b/Documentation/devicetree/bindings/net/can/fsl-flexcan.txt
@@ -18,6 +18,12 @@ Optional properties:
 
 - xceiver-supply: Regulator that powers the CAN transceiver
 
+- big-endian: This means the registers of FlexCAN controller are big endian.
+              This is optional property.i.e. if this property is not present in
+              device tree node then controller is assumed to be little endian.
+              if this property is present then controller is assumed to be big
+              endian.
+
 Example:
 
 	can@1c000 {
diff --git a/Documentation/devicetree/bindings/net/ieee802154/adf7242.txt b/Documentation/devicetree/bindings/net/ieee802154/adf7242.txt
index dea5124cdc52..d24172cc6d32 100644
--- a/Documentation/devicetree/bindings/net/ieee802154/adf7242.txt
+++ b/Documentation/devicetree/bindings/net/ieee802154/adf7242.txt
@@ -1,7 +1,7 @@
 * ADF7242 IEEE 802.15.4 *
 
 Required properties:
-  - compatible:		should be "adi,adf7242"
+  - compatible:		should be "adi,adf7242", "adi,adf7241"
   - spi-max-frequency:	maximal bus speed (12.5 MHz)
   - reg:		the chipselect index
   - interrupts:		the interrupt generated by the device via pin IRQ1.
diff --git a/Documentation/devicetree/bindings/net/phy.txt b/Documentation/devicetree/bindings/net/phy.txt
index 77d0b2a61ffa..c05479f5ac7c 100644
--- a/Documentation/devicetree/bindings/net/phy.txt
+++ b/Documentation/devicetree/bindings/net/phy.txt
@@ -53,6 +53,8 @@ Optional Properties:
   to ensure the integrated PHY is used. The absence of this property indicates
   the muxers should be configured so that the external PHY is used.
 
+- reset-gpios: The GPIO phandle and specifier for the PHY reset signal.
+
 Example:
 
 ethernet-phy@0 {
diff --git a/Documentation/networking/dsa/dsa.txt b/Documentation/networking/dsa/dsa.txt
index b8b40753133e..25170ad7d25b 100644
--- a/Documentation/networking/dsa/dsa.txt
+++ b/Documentation/networking/dsa/dsa.txt
@@ -385,11 +385,6 @@ Switch configuration
   avoid relying on what a previous software agent such as a bootloader/firmware
   may have previously configured.
 
-- set_addr: Some switches require the programming of the management interface's
-  Ethernet MAC address, switch drivers can also disable ageing of MAC addresses
-  on the management interface and "hardcode"/"force" this MAC address for the
-  CPU/management interface as an optimization
-
 PHY devices and link management
 -------------------------------
 
diff --git a/Documentation/networking/ieee802154.txt b/Documentation/networking/ieee802154.txt
index 057e9fdbfac9..e74d8e1da0e2 100644
--- a/Documentation/networking/ieee802154.txt
+++ b/Documentation/networking/ieee802154.txt
@@ -97,6 +97,46 @@ The include/net/mac802154.h defines following functions:
  - void ieee802154_unregister_hw(struct ieee802154_hw *hw):
    freeing registered PHY
 
+ - void ieee802154_rx_irqsafe(struct ieee802154_hw *hw, struct sk_buff *skb,
+                              u8 lqi):
+   telling 802.15.4 module there is a new received frame in the skb with
+   the RF Link Quality Indicator (LQI) from the hardware device
+
+ - void ieee802154_xmit_complete(struct ieee802154_hw *hw, struct sk_buff *skb,
+                                 bool ifs_handling):
+   telling 802.15.4 module the frame in the skb is or going to be
+   transmitted through the hardware device
+
+The device driver must implement the following callbacks in the IEEE 802.15.4
+operations structure at least:
+struct ieee802154_ops {
+	...
+	int	(*start)(struct ieee802154_hw *hw);
+	void	(*stop)(struct ieee802154_hw *hw);
+	...
+	int	(*xmit_async)(struct ieee802154_hw *hw, struct sk_buff *skb);
+	int	(*ed)(struct ieee802154_hw *hw, u8 *level);
+	int	(*set_channel)(struct ieee802154_hw *hw, u8 page, u8 channel);
+	...
+};
+
+ - int start(struct ieee802154_hw *hw):
+   handler that 802.15.4 module calls for the hardware device initialization.
+
+ - void stop(struct ieee802154_hw *hw):
+   handler that 802.15.4 module calls for the hardware device cleanup.
+
+ - int xmit_async(struct ieee802154_hw *hw, struct sk_buff *skb):
+   handler that 802.15.4 module calls for each frame in the skb going to be
+   transmitted through the hardware device.
+
+ - int ed(struct ieee802154_hw *hw, u8 *level):
+   handler that 802.15.4 module calls for Energy Detection from the hardware
+   device.
+
+ - int set_channel(struct ieee802154_hw *hw, u8 page, u8 channel):
+   set radio for listening on specific channel of the hardware device.
+
 Moreover IEEE 802.15.4 device operations structure should be filled.
 
 Fake drivers
diff --git a/Documentation/networking/kapi.rst b/Documentation/networking/kapi.rst
index 580289f345da..f03ae64be8bc 100644
--- a/Documentation/networking/kapi.rst
+++ b/Documentation/networking/kapi.rst
@@ -145,3 +145,27 @@ PHY Support
 
 .. kernel-doc:: drivers/net/phy/mdio_bus.c
    :internal:
+
+PHYLINK
+-------
+
+  PHYLINK interfaces traditional network drivers with PHYLIB, fixed-links,
+  and SFF modules (eg, hot-pluggable SFP) that may contain PHYs.  PHYLINK
+  provides management of the link state and link modes.
+
+.. kernel-doc:: include/linux/phylink.h
+   :internal:
+
+.. kernel-doc:: drivers/net/phy/phylink.c
+
+SFP support
+-----------
+
+.. kernel-doc:: drivers/net/phy/sfp-bus.c
+   :internal:
+
+.. kernel-doc:: include/linux/sfp.h
+   :internal:
+
+.. kernel-doc:: drivers/net/phy/sfp-bus.c
+   :export:
diff --git a/MAINTAINERS b/MAINTAINERS
index 9e0045e3ee0c..a5c0fd3bed32 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2725,12 +2725,16 @@ M:	Alexei Starovoitov <ast@kernel.org>
 M:	Daniel Borkmann <daniel@iogearbox.net>
 L:	netdev@vger.kernel.org
 L:	linux-kernel@vger.kernel.org
+T:	git git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git
+T:	git git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git
 S:	Supported
 F:	arch/x86/net/bpf_jit*
 F:	Documentation/networking/filter.txt
 F:	Documentation/bpf/
 F:	include/linux/bpf*
 F:	include/linux/filter.h
+F:	include/trace/events/bpf.h
+F:	include/trace/events/xdp.h
 F:	include/uapi/linux/bpf*
 F:	include/uapi/linux/filter.h
 F:	kernel/bpf/
@@ -9600,6 +9604,11 @@ NETWORKING [WIRELESS]
 L:	linux-wireless@vger.kernel.org
 Q:	http://patchwork.kernel.org/project/linux-wireless/list/
 
+NETDEVSIM
+M:	Jakub Kicinski <jakub.kicinski@netronome.com>
+S:	Maintained
+F:	drivers/net/netdevsim/*
+
 NETXEN (1/10) GbE SUPPORT
 M:	Manish Chopra <manish.chopra@cavium.com>
 M:	Rahul Verma <rahul.verma@cavium.com>
diff --git a/arch/arm/boot/dts/imx25.dtsi b/arch/arm/boot/dts/imx25.dtsi
index 09ce8b81fafa..fcaff1c66bcb 100644
--- a/arch/arm/boot/dts/imx25.dtsi
+++ b/arch/arm/boot/dts/imx25.dtsi
@@ -122,7 +122,7 @@
 			};
 
 			can1: can@43f88000 {
-				compatible = "fsl,imx25-flexcan", "fsl,p1010-flexcan";
+				compatible = "fsl,imx25-flexcan";
 				reg = <0x43f88000 0x4000>;
 				interrupts = <43>;
 				clocks = <&clks 75>, <&clks 75>;
@@ -131,7 +131,7 @@
 			};
 
 			can2: can@43f8c000 {
-				compatible = "fsl,imx25-flexcan", "fsl,p1010-flexcan";
+				compatible = "fsl,imx25-flexcan";
 				reg = <0x43f8c000 0x4000>;
 				interrupts = <44>;
 				clocks = <&clks 76>, <&clks 76>;
diff --git a/arch/arm/boot/dts/imx28.dtsi b/arch/arm/boot/dts/imx28.dtsi
index 2f4ebe0318d3..e52e05c0fe56 100644
--- a/arch/arm/boot/dts/imx28.dtsi
+++ b/arch/arm/boot/dts/imx28.dtsi
@@ -1038,7 +1038,7 @@
 			};
 
 			can0: can@80032000 {
-				compatible = "fsl,imx28-flexcan", "fsl,p1010-flexcan";
+				compatible = "fsl,imx28-flexcan";
 				reg = <0x80032000 0x2000>;
 				interrupts = <8>;
 				clocks = <&clks 58>, <&clks 58>;
@@ -1047,7 +1047,7 @@
 			};
 
 			can1: can@80034000 {
-				compatible = "fsl,imx28-flexcan", "fsl,p1010-flexcan";
+				compatible = "fsl,imx28-flexcan";
 				reg = <0x80034000 0x2000>;
 				interrupts = <9>;
 				clocks = <&clks 59>, <&clks 59>;
diff --git a/arch/arm/boot/dts/imx35.dtsi b/arch/arm/boot/dts/imx35.dtsi
index 6d5e6a60bee7..1f0e2203b576 100644
--- a/arch/arm/boot/dts/imx35.dtsi
+++ b/arch/arm/boot/dts/imx35.dtsi
@@ -303,7 +303,7 @@
 			};
 
 			can1: can@53fe4000 {
-				compatible = "fsl,imx35-flexcan", "fsl,p1010-flexcan";
+				compatible = "fsl,imx35-flexcan";
 				reg = <0x53fe4000 0x1000>;
 				clocks = <&clks 33>, <&clks 33>;
 				clock-names = "ipg", "per";
@@ -312,7 +312,7 @@
 			};
 
 			can2: can@53fe8000 {
-				compatible = "fsl,imx35-flexcan", "fsl,p1010-flexcan";
+				compatible = "fsl,imx35-flexcan";
 				reg = <0x53fe8000 0x1000>;
 				clocks = <&clks 34>, <&clks 34>;
 				clock-names = "ipg", "per";
diff --git a/arch/arm/boot/dts/imx53.dtsi b/arch/arm/boot/dts/imx53.dtsi
index 589a67c5f796..a5a050703320 100644
--- a/arch/arm/boot/dts/imx53.dtsi
+++ b/arch/arm/boot/dts/imx53.dtsi
@@ -545,7 +545,7 @@
 			};
 
 			can1: can@53fc8000 {
-				compatible = "fsl,imx53-flexcan", "fsl,p1010-flexcan";
+				compatible = "fsl,imx53-flexcan";
 				reg = <0x53fc8000 0x4000>;
 				interrupts = <82>;
 				clocks = <&clks IMX5_CLK_CAN1_IPG_GATE>,
@@ -555,7 +555,7 @@
 			};
 
 			can2: can@53fcc000 {
-				compatible = "fsl,imx53-flexcan", "fsl,p1010-flexcan";
+				compatible = "fsl,imx53-flexcan";
 				reg = <0x53fcc000 0x4000>;
 				interrupts = <83>;
 				clocks = <&clks IMX5_CLK_CAN2_IPG_GATE>,
diff --git a/arch/arm/boot/dts/ls1021a-qds.dts b/arch/arm/boot/dts/ls1021a-qds.dts
index 940875316d0f..4f211e3c903a 100644
--- a/arch/arm/boot/dts/ls1021a-qds.dts
+++ b/arch/arm/boot/dts/ls1021a-qds.dts
@@ -331,3 +331,19 @@
 &uart1 {
 	status = "okay";
 };
+
+&can0 {
+	status = "okay";
+};
+
+&can1 {
+	status = "okay";
+};
+
+&can2 {
+	status = "disabled";
+};
+
+&can3 {
+	status = "disabled";
+};
diff --git a/arch/arm/boot/dts/ls1021a-twr.dts b/arch/arm/boot/dts/ls1021a-twr.dts
index a8b148ad1dd2..7202d9c504be 100644
--- a/arch/arm/boot/dts/ls1021a-twr.dts
+++ b/arch/arm/boot/dts/ls1021a-twr.dts
@@ -243,3 +243,19 @@
 &uart1 {
 	status = "okay";
 };
+
+&can0 {
+	status = "okay";
+};
+
+&can1 {
+	status = "okay";
+};
+
+&can2 {
+	status = "disabled";
+};
+
+&can3 {
+	status = "disabled";
+};
diff --git a/arch/arm/boot/dts/ls1021a.dtsi b/arch/arm/boot/dts/ls1021a.dtsi
index 9319e1f0f1d8..7789031898b0 100644
--- a/arch/arm/boot/dts/ls1021a.dtsi
+++ b/arch/arm/boot/dts/ls1021a.dtsi
@@ -730,5 +730,41 @@
 					<0000 0 0 3 &gic GIC_SPI 191 IRQ_TYPE_LEVEL_HIGH>,
 					<0000 0 0 4 &gic GIC_SPI 193 IRQ_TYPE_LEVEL_HIGH>;
 		};
+
+		can0: can@2a70000 {
+			compatible = "fsl,ls1021ar2-flexcan";
+			reg = <0x0 0x2a70000 0x0 0x1000>;
+			interrupts = <GIC_SPI 126 IRQ_TYPE_LEVEL_HIGH>;
+			clocks = <&clockgen 4 1>, <&clockgen 4 1>;
+			clock-names = "ipg", "per";
+			big-endian;
+		};
+
+		can1: can@2a80000 {
+			compatible = "fsl,ls1021ar2-flexcan";
+			reg = <0x0 0x2a80000 0x0 0x1000>;
+			interrupts = <GIC_SPI 127 IRQ_TYPE_LEVEL_HIGH>;
+			clocks = <&clockgen 4 1>, <&clockgen 4 1>;
+			clock-names = "ipg", "per";
+			big-endian;
+		};
+
+		can2: can@2a90000 {
+			compatible = "fsl,ls1021ar2-flexcan";
+			reg = <0x0 0x2a90000 0x0 0x1000>;
+			interrupts = <GIC_SPI 128 IRQ_TYPE_LEVEL_HIGH>;
+			clocks = <&clockgen 4 1>, <&clockgen 4 1>;
+			clock-names = "ipg", "per";
+			big-endian;
+		};
+
+		can3: can@2aa0000 {
+			compatible = "fsl,ls1021ar2-flexcan";
+			reg = <0x0 0x2aa0000 0x0 0x1000>;
+			interrupts = <GIC_SPI 129 IRQ_TYPE_LEVEL_HIGH>;
+			clocks = <&clockgen 4 1>, <&clockgen 4 1>;
+			clock-names = "ipg", "per";
+			big-endian;
+		};
 	};
 };
diff --git a/arch/powerpc/boot/dts/fsl/p1010si-post.dtsi b/arch/powerpc/boot/dts/fsl/p1010si-post.dtsi
index af12ead88c5f..1b4aafc1f6a2 100644
--- a/arch/powerpc/boot/dts/fsl/p1010si-post.dtsi
+++ b/arch/powerpc/boot/dts/fsl/p1010si-post.dtsi
@@ -137,12 +137,14 @@
 		compatible = "fsl,p1010-flexcan";
 		reg = <0x1c000 0x1000>;
 		interrupts = <48 0x2 0 0>;
+		big-endian;
 	};
 
 	can1: can@1d000 {
 		compatible = "fsl,p1010-flexcan";
 		reg = <0x1d000 0x1000>;
 		interrupts = <61 0x2 0 0>;
+		big-endian;
 	};
 
 	L2: l2-cache-controller@20000 {
diff --git a/drivers/atm/eni.c b/drivers/atm/eni.c
index ce47eb17901d..6470e3c4c990 100644
--- a/drivers/atm/eni.c
+++ b/drivers/atm/eni.c
@@ -473,7 +473,7 @@ static int do_rx_dma(struct atm_vcc *vcc,struct sk_buff *skb,
 		ENI_PRV_POS(skb) = eni_vcc->descr+size+1;
 		skb_queue_tail(&eni_dev->rx_queue,skb);
 		eni_vcc->last = skb;
-rx_enqueued++;
+		rx_enqueued++;
 	}
 	eni_vcc->descr = here;
 	eni_out(dma_wr,MID_DMA_WR_RX);
@@ -715,7 +715,7 @@ static void get_service(struct atm_dev *dev)
 			else eni_dev->slow = vcc;
 			eni_dev->last_slow = vcc;
 		}
-putting++;
+		putting++;
 		ENI_VCC(vcc)->servicing++;
 	}
 }
@@ -744,7 +744,7 @@ static void dequeue_rx(struct atm_dev *dev)
 		}
 		EVENT("dequeued (size=%ld,pos=0x%lx)\n",ENI_PRV_SIZE(skb),
 		    ENI_PRV_POS(skb));
-rx_dequeued++;
+		rx_dequeued++;
 		vcc = ATM_SKB(skb)->vcc;
 		eni_vcc = ENI_VCC(vcc);
 		first = 0;
@@ -1174,7 +1174,7 @@ DPRINTK("doing direct send\n"); /* @@@ well, this doesn't work anyway */
 	DPRINTK("dma_wr set to %d, tx_pos is now %ld\n",dma_wr,tx->tx_pos);
 	eni_out(dma_wr,MID_DMA_WR_TX);
 	skb_queue_tail(&eni_dev->tx_queue,skb);
-queued++;
+	queued++;
 	return enq_ok;
 }
 
@@ -1195,7 +1195,7 @@ static void poll_tx(struct atm_dev *dev)
 				if (res == enq_ok) continue;
 				DPRINTK("re-queuing TX PDU\n");
 				skb_queue_head(&tx->backlog,skb);
-requeued++;
+				requeued++;
 				if (res == enq_jam) return;
 				break;
 			}
@@ -1232,7 +1232,7 @@ static void dequeue_tx(struct atm_dev *dev)
 		else dev_kfree_skb_irq(skb);
 		atomic_inc(&vcc->stats->tx);
 		wake_up(&eni_dev->tx_wait);
-dma_complete++;
+		dma_complete++;
 	}
 }
 
@@ -1555,7 +1555,7 @@ static void eni_tasklet(unsigned long data)
 	}
 	if (events & MID_TX_COMPLETE) {
 		EVENT("INT: TX COMPLETE\n",0,0);
-tx_complete++;
+		tx_complete++;
 		wake_up(&eni_dev->tx_wait);
 		/* poll_rx ? */
 	}
@@ -2069,14 +2069,14 @@ static int eni_send(struct atm_vcc *vcc,struct sk_buff *skb)
 		}
 		*(u32 *) skb->data = htonl(*(u32 *) skb->data);
 	}
-submitted++;
+	submitted++;
 	ATM_SKB(skb)->vcc = vcc;
 	tasklet_disable(&ENI_DEV(vcc->dev)->task);
 	res = do_tx(skb);
 	tasklet_enable(&ENI_DEV(vcc->dev)->task);
 	if (res == enq_ok) return 0;
 	skb_queue_tail(&ENI_VCC(vcc)->tx->backlog,skb);
-backlogged++;
+	backlogged++;
 	tasklet_schedule(&ENI_DEV(vcc->dev)->task);
 	return 0;
 }
diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c
index 12eb8caa4263..50e071444a5c 100644
--- a/drivers/hv/ring_buffer.c
+++ b/drivers/hv/ring_buffer.c
@@ -140,6 +140,29 @@ static u32 hv_copyto_ringbuffer(
 	return start_write_offset;
 }
 
+/*
+ *
+ * hv_get_ringbuffer_availbytes()
+ *
+ * Get number of bytes available to read and to write to
+ * for the specified ring buffer
+ */
+static void
+hv_get_ringbuffer_availbytes(const struct hv_ring_buffer_info *rbi,
+			     u32 *read, u32 *write)
+{
+	u32 read_loc, write_loc, dsize;
+
+	/* Capture the read/write indices before they changed */
+	read_loc = READ_ONCE(rbi->ring_buffer->read_index);
+	write_loc = READ_ONCE(rbi->ring_buffer->write_index);
+	dsize = rbi->ring_datasize;
+
+	*write = write_loc >= read_loc ? dsize - (write_loc - read_loc) :
+		read_loc - write_loc;
+	*read = dsize - *write;
+}
+
 /* Get various debug metrics for the specified ring buffer. */
 void hv_ringbuffer_get_debuginfo(const struct hv_ring_buffer_info *ring_info,
 				 struct hv_ring_buffer_debug_info *debug_info)
diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index 0936da592e12..944ec3c9282c 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -497,4 +497,15 @@ config THUNDERBOLT_NET
 
 source "drivers/net/hyperv/Kconfig"
 
+config NETDEVSIM
+	tristate "Simulated networking device"
+	depends on DEBUG_FS
+	help
+	  This driver is a developer testing tool and software model that can
+	  be used to test various control path networking APIs, especially
+	  HW-offload related.
+
+	  To compile this driver as a module, choose M here: the module
+	  will be called netdevsim.
+
 endif # NETDEVICES
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index 766f62d02a0b..04c3b747812c 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -78,3 +78,4 @@ obj-$(CONFIG_FUJITSU_ES) += fjes/
 
 thunderbolt-net-y += thunderbolt.o
 obj-$(CONFIG_THUNDERBOLT_NET) += thunderbolt-net.o
+obj-$(CONFIG_NETDEVSIM) += netdevsim/
diff --git a/drivers/net/can/c_can/c_can_pci.c b/drivers/net/can/c_can/c_can_pci.c
index d065c0e2d18e..406b4847e5dc 100644
--- a/drivers/net/can/c_can/c_can_pci.c
+++ b/drivers/net/can/c_can/c_can_pci.c
@@ -251,14 +251,14 @@ static void c_can_pci_remove(struct pci_dev *pdev)
 	pci_disable_device(pdev);
 }
 
-static struct c_can_pci_data c_can_sta2x11= {
+static const struct c_can_pci_data c_can_sta2x11= {
 	.type = BOSCH_C_CAN,
 	.reg_align = C_CAN_REG_ALIGN_32,
 	.freq = 52000000, /* 52 Mhz */
 	.bar = 0,
 };
 
-static struct c_can_pci_data c_can_pch = {
+static const struct c_can_pci_data c_can_pch = {
 	.type = BOSCH_C_CAN,
 	.reg_align = C_CAN_REG_32,
 	.freq = 50000000, /* 50 MHz */
diff --git a/drivers/net/can/flexcan.c b/drivers/net/can/flexcan.c
index 0626dcfd1f3d..3cd371c94e83 100644
--- a/drivers/net/can/flexcan.c
+++ b/drivers/net/can/flexcan.c
@@ -190,6 +190,7 @@
  *   MX53  FlexCAN2  03.00.00.00    yes        no        no       no        no
  *   MX6s  FlexCAN3  10.00.12.00    yes       yes        no       no       yes
  *   VF610 FlexCAN3  ?               no       yes        no      yes       yes?
+ * LS1021A FlexCAN2  03.00.04.00     no       yes        no       no       yes
  *
  * Some SOCs do not have the RX_WARN & TX_WARN interrupt line connected.
  */
@@ -279,6 +280,10 @@ struct flexcan_priv {
 	struct clk *clk_per;
 	const struct flexcan_devtype_data *devtype_data;
 	struct regulator *reg_xceiver;
+
+	/* Read and Write APIs */
+	u32 (*read)(void __iomem *addr);
+	void (*write)(u32 val, void __iomem *addr);
 };
 
 static const struct flexcan_devtype_data fsl_p1010_devtype_data = {
@@ -301,6 +306,12 @@ static const struct flexcan_devtype_data fsl_vf610_devtype_data = {
 		FLEXCAN_QUIRK_BROKEN_PERR_STATE,
 };
 
+static const struct flexcan_devtype_data fsl_ls1021a_r2_devtype_data = {
+	.quirks = FLEXCAN_QUIRK_DISABLE_RXFG | FLEXCAN_QUIRK_ENABLE_EACEN_RRS |
+		FLEXCAN_QUIRK_DISABLE_MECR | FLEXCAN_QUIRK_BROKEN_PERR_STATE |
+		FLEXCAN_QUIRK_USE_OFF_TIMESTAMP,
+};
+
 static const struct can_bittiming_const flexcan_bittiming_const = {
 	.name = DRV_NAME,
 	.tseg1_min = 4,
@@ -313,39 +324,45 @@ static const struct can_bittiming_const flexcan_bittiming_const = {
 	.brp_inc = 1,
 };
 
-/* Abstract off the read/write for arm versus ppc. This
- * assumes that PPC uses big-endian registers and everything
- * else uses little-endian registers, independent of CPU
- * endianness.
+/* FlexCAN module is essentially modelled as a little-endian IP in most
+ * SoCs, i.e the registers as well as the message buffer areas are
+ * implemented in a little-endian fashion.
+ *
+ * However there are some SoCs (e.g. LS1021A) which implement the FlexCAN
+ * module in a big-endian fashion (i.e the registers as well as the
+ * message buffer areas are implemented in a big-endian way).
+ *
+ * In addition, the FlexCAN module can be found on SoCs having ARM or
+ * PPC cores. So, we need to abstract off the register read/write
+ * functions, ensuring that these cater to all the combinations of module
+ * endianness and underlying CPU endianness.
  */
-#if defined(CONFIG_PPC)
-static inline u32 flexcan_read(void __iomem *addr)
+static inline u32 flexcan_read_be(void __iomem *addr)
 {
-	return in_be32(addr);
+	return ioread32be(addr);
 }
 
-static inline void flexcan_write(u32 val, void __iomem *addr)
+static inline void flexcan_write_be(u32 val, void __iomem *addr)
 {
-	out_be32(addr, val);
+	iowrite32be(val, addr);
 }
-#else
-static inline u32 flexcan_read(void __iomem *addr)
+
+static inline u32 flexcan_read_le(void __iomem *addr)
 {
-	return readl(addr);
+	return ioread32(addr);
 }
 
-static inline void flexcan_write(u32 val, void __iomem *addr)
+static inline void flexcan_write_le(u32 val, void __iomem *addr)
 {
-	writel(val, addr);
+	iowrite32(val, addr);
 }
-#endif
 
 static inline void flexcan_error_irq_enable(const struct flexcan_priv *priv)
 {
 	struct flexcan_regs __iomem *regs = priv->regs;
 	u32 reg_ctrl = (priv->reg_ctrl_default | FLEXCAN_CTRL_ERR_MSK);
 
-	flexcan_write(reg_ctrl, &regs->ctrl);
+	priv->write(reg_ctrl, &regs->ctrl);
 }
 
 static inline void flexcan_error_irq_disable(const struct flexcan_priv *priv)
@@ -353,7 +370,7 @@ static inline void flexcan_error_irq_disable(const struct flexcan_priv *priv)
 	struct flexcan_regs __iomem *regs = priv->regs;
 	u32 reg_ctrl = (priv->reg_ctrl_default & ~FLEXCAN_CTRL_ERR_MSK);
 
-	flexcan_write(reg_ctrl, &regs->ctrl);
+	priv->write(reg_ctrl, &regs->ctrl);
 }
 
 static inline int flexcan_transceiver_enable(const struct flexcan_priv *priv)
@@ -378,14 +395,14 @@ static int flexcan_chip_enable(struct flexcan_priv *priv)
 	unsigned int timeout = FLEXCAN_TIMEOUT_US / 10;
 	u32 reg;
 
-	reg = flexcan_read(&regs->mcr);
+	reg = priv->read(&regs->mcr);
 	reg &= ~FLEXCAN_MCR_MDIS;
-	flexcan_write(reg, &regs->mcr);
+	priv->write(reg, &regs->mcr);
 
-	while (timeout-- && (flexcan_read(&regs->mcr) & FLEXCAN_MCR_LPM_ACK))
+	while (timeout-- && (priv->read(&regs->mcr) & FLEXCAN_MCR_LPM_ACK))
 		udelay(10);
 
-	if (flexcan_read(&regs->mcr) & FLEXCAN_MCR_LPM_ACK)
+	if (priv->read(&regs->mcr) & FLEXCAN_MCR_LPM_ACK)
 		return -ETIMEDOUT;
 
 	return 0;
@@ -397,14 +414,14 @@ static int flexcan_chip_disable(struct flexcan_priv *priv)
 	unsigned int timeout = FLEXCAN_TIMEOUT_US / 10;
 	u32 reg;
 
-	reg = flexcan_read(&regs->mcr);
+	reg = priv->read(&regs->mcr);
 	reg |= FLEXCAN_MCR_MDIS;
-	flexcan_write(reg, &regs->mcr);
+	priv->write(reg, &regs->mcr);
 
-	while (timeout-- && !(flexcan_read(&regs->mcr) & FLEXCAN_MCR_LPM_ACK))
+	while (timeout-- && !(priv->read(&regs->mcr) & FLEXCAN_MCR_LPM_ACK))
 		udelay(10);
 
-	if (!(flexcan_read(&regs->mcr) & FLEXCAN_MCR_LPM_ACK))
+	if (!(priv->read(&regs->mcr) & FLEXCAN_MCR_LPM_ACK))
 		return -ETIMEDOUT;
 
 	return 0;
@@ -416,14 +433,14 @@ static int flexcan_chip_freeze(struct flexcan_priv *priv)
 	unsigned int timeout = 1000 * 1000 * 10 / priv->can.bittiming.bitrate;
 	u32 reg;
 
-	reg = flexcan_read(&regs->mcr);
+	reg = priv->read(&regs->mcr);
 	reg |= FLEXCAN_MCR_HALT;
-	flexcan_write(reg, &regs->mcr);
+	priv->write(reg, &regs->mcr);
 
-	while (timeout-- && !(flexcan_read(&regs->mcr) & FLEXCAN_MCR_FRZ_ACK))
+	while (timeout-- && !(priv->read(&regs->mcr) & FLEXCAN_MCR_FRZ_ACK))
 		udelay(100);
 
-	if (!(flexcan_read(&regs->mcr) & FLEXCAN_MCR_FRZ_ACK))
+	if (!(priv->read(&regs->mcr) & FLEXCAN_MCR_FRZ_ACK))
 		return -ETIMEDOUT;
 
 	return 0;
@@ -435,14 +452,14 @@ static int flexcan_chip_unfreeze(struct flexcan_priv *priv)
 	unsigned int timeout = FLEXCAN_TIMEOUT_US / 10;
 	u32 reg;
 
-	reg = flexcan_read(&regs->mcr);
+	reg = priv->read(&regs->mcr);
 	reg &= ~FLEXCAN_MCR_HALT;
-	flexcan_write(reg, &regs->mcr);
+	priv->write(reg, &regs->mcr);
 
-	while (timeout-- && (flexcan_read(&regs->mcr) & FLEXCAN_MCR_FRZ_ACK))
+	while (timeout-- && (priv->read(&regs->mcr) & FLEXCAN_MCR_FRZ_ACK))
 		udelay(10);
 
-	if (flexcan_read(&regs->mcr) & FLEXCAN_MCR_FRZ_ACK)
+	if (priv->read(&regs->mcr) & FLEXCAN_MCR_FRZ_ACK)
 		return -ETIMEDOUT;
 
 	return 0;
@@ -453,11 +470,11 @@ static int flexcan_chip_softreset(struct flexcan_priv *priv)
 	struct flexcan_regs __iomem *regs = priv->regs;
 	unsigned int timeout = FLEXCAN_TIMEOUT_US / 10;
 
-	flexcan_write(FLEXCAN_MCR_SOFTRST, &regs->mcr);
-	while (timeout-- && (flexcan_read(&regs->mcr) & FLEXCAN_MCR_SOFTRST))
+	priv->write(FLEXCAN_MCR_SOFTRST, &regs->mcr);
+	while (timeout-- && (priv->read(&regs->mcr) & FLEXCAN_MCR_SOFTRST))
 		udelay(10);
 
-	if (flexcan_read(&regs->mcr) & FLEXCAN_MCR_SOFTRST)
+	if (priv->read(&regs->mcr) & FLEXCAN_MCR_SOFTRST)
 		return -ETIMEDOUT;
 
 	return 0;
@@ -468,7 +485,7 @@ static int __flexcan_get_berr_counter(const struct net_device *dev,
 {
 	const struct flexcan_priv *priv = netdev_priv(dev);
 	struct flexcan_regs __iomem *regs = priv->regs;
-	u32 reg = flexcan_read(&regs->ecr);
+	u32 reg = priv->read(&regs->ecr);
 
 	bec->txerr = (reg >> 0) & 0xff;
 	bec->rxerr = (reg >> 8) & 0xff;
@@ -524,24 +541,24 @@ static int flexcan_start_xmit(struct sk_buff *skb, struct net_device *dev)
 
 	if (cf->can_dlc > 0) {
 		data = be32_to_cpup((__be32 *)&cf->data[0]);
-		flexcan_write(data, &priv->tx_mb->data[0]);
+		priv->write(data, &priv->tx_mb->data[0]);
 	}
 	if (cf->can_dlc > 3) {
 		data = be32_to_cpup((__be32 *)&cf->data[4]);
-		flexcan_write(data, &priv->tx_mb->data[1]);
+		priv->write(data, &priv->tx_mb->data[1]);
 	}
 
 	can_put_echo_skb(skb, dev, 0);
 
-	flexcan_write(can_id, &priv->tx_mb->can_id);
-	flexcan_write(ctrl, &priv->tx_mb->can_ctrl);
+	priv->write(can_id, &priv->tx_mb->can_id);
+	priv->write(ctrl, &priv->tx_mb->can_ctrl);
 
 	/* Errata ERR005829 step8:
 	 * Write twice INACTIVE(0x8) code to first MB.
 	 */
-	flexcan_write(FLEXCAN_MB_CODE_TX_INACTIVE,
+	priv->write(FLEXCAN_MB_CODE_TX_INACTIVE,
 		      &priv->tx_mb_reserved->can_ctrl);
-	flexcan_write(FLEXCAN_MB_CODE_TX_INACTIVE,
+	priv->write(FLEXCAN_MB_CODE_TX_INACTIVE,
 		      &priv->tx_mb_reserved->can_ctrl);
 
 	return NETDEV_TX_OK;
@@ -660,7 +677,7 @@ static unsigned int flexcan_mailbox_read(struct can_rx_offload *offload,
 		u32 code;
 
 		do {
-			reg_ctrl = flexcan_read(&mb->can_ctrl);
+			reg_ctrl = priv->read(&mb->can_ctrl);
 		} while (reg_ctrl & FLEXCAN_MB_CODE_RX_BUSY_BIT);
 
 		/* is this MB empty? */
@@ -675,17 +692,17 @@ static unsigned int flexcan_mailbox_read(struct can_rx_offload *offload,
 			offload->dev->stats.rx_errors++;
 		}
 	} else {
-		reg_iflag1 = flexcan_read(&regs->iflag1);
+		reg_iflag1 = priv->read(&regs->iflag1);
 		if (!(reg_iflag1 & FLEXCAN_IFLAG_RX_FIFO_AVAILABLE))
 			return 0;
 
-		reg_ctrl = flexcan_read(&mb->can_ctrl);
+		reg_ctrl = priv->read(&mb->can_ctrl);
 	}
 
 	/* increase timstamp to full 32 bit */
 	*timestamp = reg_ctrl << 16;
 
-	reg_id = flexcan_read(&mb->can_id);
+	reg_id = priv->read(&mb->can_id);
 	if (reg_ctrl & FLEXCAN_MB_CNT_IDE)
 		cf->can_id = ((reg_id >> 0) & CAN_EFF_MASK) | CAN_EFF_FLAG;
 	else
@@ -695,19 +712,19 @@ static unsigned int flexcan_mailbox_read(struct can_rx_offload *offload,
 		cf->can_id |= CAN_RTR_FLAG;
 	cf->can_dlc = get_can_dlc((reg_ctrl >> 16) & 0xf);
 
-	*(__be32 *)(cf->data + 0) = cpu_to_be32(flexcan_read(&mb->data[0]));
-	*(__be32 *)(cf->data + 4) = cpu_to_be32(flexcan_read(&mb->data[1]));
+	*(__be32 *)(cf->data + 0) = cpu_to_be32(priv->read(&mb->data[0]));
+	*(__be32 *)(cf->data + 4) = cpu_to_be32(priv->read(&mb->data[1]));
 
 	/* mark as read */
 	if (priv->devtype_data->quirks & FLEXCAN_QUIRK_USE_OFF_TIMESTAMP) {
 		/* Clear IRQ */
 		if (n < 32)
-			flexcan_write(BIT(n), &regs->iflag1);
+			priv->write(BIT(n), &regs->iflag1);
 		else
-			flexcan_write(BIT(n - 32), &regs->iflag2);
+			priv->write(BIT(n - 32), &regs->iflag2);
 	} else {
-		flexcan_write(FLEXCAN_IFLAG_RX_FIFO_AVAILABLE, &regs->iflag1);
-		flexcan_read(&regs->timer);
+		priv->write(FLEXCAN_IFLAG_RX_FIFO_AVAILABLE, &regs->iflag1);
+		priv->read(&regs->timer);
 	}
 
 	return 1;
@@ -719,8 +736,8 @@ static inline u64 flexcan_read_reg_iflag_rx(struct flexcan_priv *priv)
 	struct flexcan_regs __iomem *regs = priv->regs;
 	u32 iflag1, iflag2;
 
-	iflag2 = flexcan_read(&regs->iflag2) & priv->reg_imask2_default;
-	iflag1 = flexcan_read(&regs->iflag1) & priv->reg_imask1_default &
+	iflag2 = priv->read(&regs->iflag2) & priv->reg_imask2_default;
+	iflag1 = priv->read(&regs->iflag1) & priv->reg_imask1_default &
 		~FLEXCAN_IFLAG_MB(priv->tx_mb_idx);
 
 	return (u64)iflag2 << 32 | iflag1;
@@ -736,7 +753,7 @@ static irqreturn_t flexcan_irq(int irq, void *dev_id)
 	u32 reg_iflag1, reg_esr;
 	enum can_state last_state = priv->can.state;
 
-	reg_iflag1 = flexcan_read(&regs->iflag1);
+	reg_iflag1 = priv->read(&regs->iflag1);
 
 	/* reception interrupt */
 	if (priv->devtype_data->quirks & FLEXCAN_QUIRK_USE_OFF_TIMESTAMP) {
@@ -759,7 +776,8 @@ static irqreturn_t flexcan_irq(int irq, void *dev_id)
 		/* FIFO overflow interrupt */
 		if (reg_iflag1 & FLEXCAN_IFLAG_RX_FIFO_OVERFLOW) {
 			handled = IRQ_HANDLED;
-			flexcan_write(FLEXCAN_IFLAG_RX_FIFO_OVERFLOW, &regs->iflag1);
+			priv->write(FLEXCAN_IFLAG_RX_FIFO_OVERFLOW,
+				    &regs->iflag1);
 			dev->stats.rx_over_errors++;
 			dev->stats.rx_errors++;
 		}
@@ -773,18 +791,18 @@ static irqreturn_t flexcan_irq(int irq, void *dev_id)
 		can_led_event(dev, CAN_LED_EVENT_TX);
 
 		/* after sending a RTR frame MB is in RX mode */
-		flexcan_write(FLEXCAN_MB_CODE_TX_INACTIVE,
-			      &priv->tx_mb->can_ctrl);
-		flexcan_write(FLEXCAN_IFLAG_MB(priv->tx_mb_idx), &regs->iflag1);
+		priv->write(FLEXCAN_MB_CODE_TX_INACTIVE,
+			    &priv->tx_mb->can_ctrl);
+		priv->write(FLEXCAN_IFLAG_MB(priv->tx_mb_idx), &regs->iflag1);
 		netif_wake_queue(dev);
 	}
 
-	reg_esr = flexcan_read(&regs->esr);
+	reg_esr = priv->read(&regs->esr);
 
 	/* ACK all bus error and state change IRQ sources */
 	if (reg_esr & FLEXCAN_ESR_ALL_INT) {
 		handled = IRQ_HANDLED;
-		flexcan_write(reg_esr & FLEXCAN_ESR_ALL_INT, &regs->esr);
+		priv->write(reg_esr & FLEXCAN_ESR_ALL_INT, &regs->esr);
 	}
 
 	/* state change interrupt or broken error state quirk fix is enabled */
@@ -846,7 +864,7 @@ static void flexcan_set_bittiming(struct net_device *dev)
 	struct flexcan_regs __iomem *regs = priv->regs;
 	u32 reg;
 
-	reg = flexcan_read(&regs->ctrl);
+	reg = priv->read(&regs->ctrl);
 	reg &= ~(FLEXCAN_CTRL_PRESDIV(0xff) |
 		 FLEXCAN_CTRL_RJW(0x3) |
 		 FLEXCAN_CTRL_PSEG1(0x7) |
@@ -870,11 +888,11 @@ static void flexcan_set_bittiming(struct net_device *dev)
 		reg |= FLEXCAN_CTRL_SMP;
 
 	netdev_dbg(dev, "writing ctrl=0x%08x\n", reg);
-	flexcan_write(reg, &regs->ctrl);
+	priv->write(reg, &regs->ctrl);
 
 	/* print chip status */
 	netdev_dbg(dev, "%s: mcr=0x%08x ctrl=0x%08x\n", __func__,
-		   flexcan_read(&regs->mcr), flexcan_read(&regs->ctrl));
+		   priv->read(&regs->mcr), priv->read(&regs->ctrl));
 }
 
 /* flexcan_chip_start
@@ -913,7 +931,7 @@ static int flexcan_chip_start(struct net_device *dev)
 	 * choose format C
 	 * set max mailbox number
 	 */
-	reg_mcr = flexcan_read(&regs->mcr);
+	reg_mcr = priv->read(&regs->mcr);
 	reg_mcr &= ~FLEXCAN_MCR_MAXMB(0xff);
 	reg_mcr |= FLEXCAN_MCR_FRZ | FLEXCAN_MCR_HALT | FLEXCAN_MCR_SUPV |
 		FLEXCAN_MCR_WRN_EN | FLEXCAN_MCR_SRX_DIS | FLEXCAN_MCR_IRMQ |
@@ -927,7 +945,7 @@ static int flexcan_chip_start(struct net_device *dev)
 			FLEXCAN_MCR_MAXMB(priv->tx_mb_idx);
 	}
 	netdev_dbg(dev, "%s: writing mcr=0x%08x", __func__, reg_mcr);
-	flexcan_write(reg_mcr, &regs->mcr);
+	priv->write(reg_mcr, &regs->mcr);
 
 	/* CTRL
 	 *
@@ -940,7 +958,7 @@ static int flexcan_chip_start(struct net_device *dev)
 	 * enable bus off interrupt
 	 * (== FLEXCAN_CTRL_ERR_STATE)
 	 */
-	reg_ctrl = flexcan_read(&regs->ctrl);
+	reg_ctrl = priv->read(&regs->ctrl);
 	reg_ctrl &= ~FLEXCAN_CTRL_TSYN;
 	reg_ctrl |= FLEXCAN_CTRL_BOFF_REC | FLEXCAN_CTRL_LBUF |
 		FLEXCAN_CTRL_ERR_STATE;
@@ -960,45 +978,45 @@ static int flexcan_chip_start(struct net_device *dev)
 	/* leave interrupts disabled for now */
 	reg_ctrl &= ~FLEXCAN_CTRL_ERR_ALL;
 	netdev_dbg(dev, "%s: writing ctrl=0x%08x", __func__, reg_ctrl);
-	flexcan_write(reg_ctrl, &regs->ctrl);
+	priv->write(reg_ctrl, &regs->ctrl);
 
 	if ((priv->devtype_data->quirks & FLEXCAN_QUIRK_ENABLE_EACEN_RRS)) {
-		reg_ctrl2 = flexcan_read(&regs->ctrl2);
+		reg_ctrl2 = priv->read(&regs->ctrl2);
 		reg_ctrl2 |= FLEXCAN_CTRL2_EACEN | FLEXCAN_CTRL2_RRS;
-		flexcan_write(reg_ctrl2, &regs->ctrl2);
+		priv->write(reg_ctrl2, &regs->ctrl2);
 	}
 
 	/* clear and invalidate all mailboxes first */
 	for (i = priv->tx_mb_idx; i < ARRAY_SIZE(regs->mb); i++) {
-		flexcan_write(FLEXCAN_MB_CODE_RX_INACTIVE,
-			      &regs->mb[i].can_ctrl);
+		priv->write(FLEXCAN_MB_CODE_RX_INACTIVE,
+			    &regs->mb[i].can_ctrl);
 	}
 
 	if (priv->devtype_data->quirks & FLEXCAN_QUIRK_USE_OFF_TIMESTAMP) {
 		for (i = priv->offload.mb_first; i <= priv->offload.mb_last; i++)
-			flexcan_write(FLEXCAN_MB_CODE_RX_EMPTY,
-				      &regs->mb[i].can_ctrl);
+			priv->write(FLEXCAN_MB_CODE_RX_EMPTY,
+				    &regs->mb[i].can_ctrl);
 	}
 
 	/* Errata ERR005829: mark first TX mailbox as INACTIVE */
-	flexcan_write(FLEXCAN_MB_CODE_TX_INACTIVE,
-		      &priv->tx_mb_reserved->can_ctrl);
+	priv->write(FLEXCAN_MB_CODE_TX_INACTIVE,
+		    &priv->tx_mb_reserved->can_ctrl);
 
 	/* mark TX mailbox as INACTIVE */
-	flexcan_write(FLEXCAN_MB_CODE_TX_INACTIVE,
-		      &priv->tx_mb->can_ctrl);
+	priv->write(FLEXCAN_MB_CODE_TX_INACTIVE,
+		    &priv->tx_mb->can_ctrl);
 
 	/* acceptance mask/acceptance code (accept everything) */
-	flexcan_write(0x0, &regs->rxgmask);
-	flexcan_write(0x0, &regs->rx14mask);
-	flexcan_write(0x0, &regs->rx15mask);
+	priv->write(0x0, &regs->rxgmask);
+	priv->write(0x0, &regs->rx14mask);
+	priv->write(0x0, &regs->rx15mask);
 
 	if (priv->devtype_data->quirks & FLEXCAN_QUIRK_DISABLE_RXFG)
-		flexcan_write(0x0, &regs->rxfgmask);
+		priv->write(0x0, &regs->rxfgmask);
 
 	/* clear acceptance filters */
 	for (i = 0; i < ARRAY_SIZE(regs->mb); i++)
-		flexcan_write(0, &regs->rximr[i]);
+		priv->write(0, &regs->rximr[i]);
 
 	/* On Vybrid, disable memory error detection interrupts
 	 * and freeze mode.
@@ -1011,16 +1029,16 @@ static int flexcan_chip_start(struct net_device *dev)
 		 * and Correction of Memory Errors" to write to
 		 * MECR register
 		 */
-		reg_ctrl2 = flexcan_read(&regs->ctrl2);
+		reg_ctrl2 = priv->read(&regs->ctrl2);
 		reg_ctrl2 |= FLEXCAN_CTRL2_ECRWRE;
-		flexcan_write(reg_ctrl2, &regs->ctrl2);
+		priv->write(reg_ctrl2, &regs->ctrl2);
 
-		reg_mecr = flexcan_read(&regs->mecr);
+		reg_mecr = priv->read(&regs->mecr);
 		reg_mecr &= ~FLEXCAN_MECR_ECRWRDIS;
-		flexcan_write(reg_mecr, &regs->mecr);
+		priv->write(reg_mecr, &regs->mecr);
 		reg_mecr &= ~(FLEXCAN_MECR_NCEFAFRZ | FLEXCAN_MECR_HANCEI_MSK |
 			      FLEXCAN_MECR_FANCEI_MSK);
-		flexcan_write(reg_mecr, &regs->mecr);
+		priv->write(reg_mecr, &regs->mecr);
 	}
 
 	err = flexcan_transceiver_enable(priv);
@@ -1036,14 +1054,14 @@ static int flexcan_chip_start(struct net_device *dev)
 
 	/* enable interrupts atomically */
 	disable_irq(dev->irq);
-	flexcan_write(priv->reg_ctrl_default, &regs->ctrl);
-	flexcan_write(priv->reg_imask1_default, &regs->imask1);
-	flexcan_write(priv->reg_imask2_default, &regs->imask2);
+	priv->write(priv->reg_ctrl_default, &regs->ctrl);
+	priv->write(priv->reg_imask1_default, &regs->imask1);
+	priv->write(priv->reg_imask2_default, &regs->imask2);
 	enable_irq(dev->irq);
 
 	/* print chip status */
 	netdev_dbg(dev, "%s: reading mcr=0x%08x ctrl=0x%08x\n", __func__,
-		   flexcan_read(&regs->mcr), flexcan_read(&regs->ctrl));
+		   priv->read(&regs->mcr), priv->read(&regs->ctrl));
 
 	return 0;
 
@@ -1068,10 +1086,10 @@ static void flexcan_chip_stop(struct net_device *dev)
 	flexcan_chip_disable(priv);
 
 	/* Disable all interrupts */
-	flexcan_write(0, &regs->imask2);
-	flexcan_write(0, &regs->imask1);
-	flexcan_write(priv->reg_ctrl_default & ~FLEXCAN_CTRL_ERR_ALL,
-		      &regs->ctrl);
+	priv->write(0, &regs->imask2);
+	priv->write(0, &regs->imask1);
+	priv->write(priv->reg_ctrl_default & ~FLEXCAN_CTRL_ERR_ALL,
+		    &regs->ctrl);
 
 	flexcan_transceiver_disable(priv);
 	priv->can.state = CAN_STATE_STOPPED;
@@ -1186,26 +1204,26 @@ static int register_flexcandev(struct net_device *dev)
 	err = flexcan_chip_disable(priv);
 	if (err)
 		goto out_disable_per;
-	reg = flexcan_read(&regs->ctrl);
+	reg = priv->read(&regs->ctrl);
 	reg |= FLEXCAN_CTRL_CLK_SRC;
-	flexcan_write(reg, &regs->ctrl);
+	priv->write(reg, &regs->ctrl);
 
 	err = flexcan_chip_enable(priv);
 	if (err)
 		goto out_chip_disable;
 
 	/* set freeze, halt and activate FIFO, restrict register access */
-	reg = flexcan_read(&regs->mcr);
+	reg = priv->read(&regs->mcr);
 	reg |= FLEXCAN_MCR_FRZ | FLEXCAN_MCR_HALT |
 		FLEXCAN_MCR_FEN | FLEXCAN_MCR_SUPV;
-	flexcan_write(reg, &regs->mcr);
+	priv->write(reg, &regs->mcr);
 
 	/* Currently we only support newer versions of this core
 	 * featuring a RX hardware FIFO (although this driver doesn't
 	 * make use of it on some cores). Older cores, found on some
 	 * Coldfire derivates are not tested.
 	 */
-	reg = flexcan_read(&regs->mcr);
+	reg = priv->read(&regs->mcr);
 	if (!(reg & FLEXCAN_MCR_FEN)) {
 		netdev_err(dev, "Could not enable RX FIFO, unsupported core\n");
 		err = -ENODEV;
@@ -1233,8 +1251,12 @@ static void unregister_flexcandev(struct net_device *dev)
 static const struct of_device_id flexcan_of_match[] = {
 	{ .compatible = "fsl,imx6q-flexcan", .data = &fsl_imx6q_devtype_data, },
 	{ .compatible = "fsl,imx28-flexcan", .data = &fsl_imx28_devtype_data, },
+	{ .compatible = "fsl,imx53-flexcan", .data = &fsl_p1010_devtype_data, },
+	{ .compatible = "fsl,imx35-flexcan", .data = &fsl_p1010_devtype_data, },
+	{ .compatible = "fsl,imx25-flexcan", .data = &fsl_p1010_devtype_data, },
 	{ .compatible = "fsl,p1010-flexcan", .data = &fsl_p1010_devtype_data, },
 	{ .compatible = "fsl,vf610-flexcan", .data = &fsl_vf610_devtype_data, },
+	{ .compatible = "fsl,ls1021ar2-flexcan", .data = &fsl_ls1021a_r2_devtype_data, },
 	{ /* sentinel */ },
 };
 MODULE_DEVICE_TABLE(of, flexcan_of_match);
@@ -1314,6 +1336,21 @@ static int flexcan_probe(struct platform_device *pdev)
 	dev->flags |= IFF_ECHO;
 
 	priv = netdev_priv(dev);
+
+	if (of_property_read_bool(pdev->dev.of_node, "big-endian")) {
+		priv->read = flexcan_read_be;
+		priv->write = flexcan_write_be;
+	} else {
+		if (of_device_is_compatible(pdev->dev.of_node,
+					    "fsl,p1010-flexcan")) {
+			priv->read = flexcan_read_be;
+			priv->write = flexcan_write_be;
+		} else {
+			priv->read = flexcan_read_le;
+			priv->write = flexcan_write_le;
+		}
+	}
+
 	priv->can.clock.freq = clock_freq;
 	priv->can.bittiming_const = &flexcan_bittiming_const;
 	priv->can.do_set_mode = flexcan_set_mode;
diff --git a/drivers/net/can/usb/peak_usb/pcan_usb.c b/drivers/net/can/usb/peak_usb/pcan_usb.c
index 25a9b79cc42d..f530a80f5051 100644
--- a/drivers/net/can/usb/peak_usb/pcan_usb.c
+++ b/drivers/net/can/usb/peak_usb/pcan_usb.c
@@ -408,7 +408,6 @@ static int pcan_usb_decode_error(struct pcan_usb_msg_context *mc, u8 n,
 {
 	struct sk_buff *skb;
 	struct can_frame *cf;
-	struct timeval tv;
 	enum can_state new_state;
 
 	/* ignore this error until 1st ts received */
@@ -525,8 +524,8 @@ static int pcan_usb_decode_error(struct pcan_usb_msg_context *mc, u8 n,
 	if (status_len & PCAN_USB_STATUSLEN_TIMESTAMP) {
 		struct skb_shared_hwtstamps *hwts = skb_hwtstamps(skb);
 
-		peak_usb_get_ts_tv(&mc->pdev->time_ref, mc->ts16, &tv);
-		hwts->hwtstamp = timeval_to_ktime(tv);
+		peak_usb_get_ts_time(&mc->pdev->time_ref, mc->ts16,
+				     &hwts->hwtstamp);
 	}
 
 	mc->netdev->stats.rx_packets++;
@@ -610,7 +609,6 @@ static int pcan_usb_decode_data(struct pcan_usb_msg_context *mc, u8 status_len)
 	u8 rec_len = status_len & PCAN_USB_STATUSLEN_DLC;
 	struct sk_buff *skb;
 	struct can_frame *cf;
-	struct timeval tv;
 	struct skb_shared_hwtstamps *hwts;
 
 	skb = alloc_can_skb(mc->netdev, &cf);
@@ -658,9 +656,8 @@ static int pcan_usb_decode_data(struct pcan_usb_msg_context *mc, u8 status_len)
 	}
 
 	/* convert timestamp into kernel time */
-	peak_usb_get_ts_tv(&mc->pdev->time_ref, mc->ts16, &tv);
 	hwts = skb_hwtstamps(skb);
-	hwts->hwtstamp = timeval_to_ktime(tv);
+	peak_usb_get_ts_time(&mc->pdev->time_ref, mc->ts16, &hwts->hwtstamp);
 
 	/* update statistics */
 	mc->netdev->stats.rx_packets++;
diff --git a/drivers/net/can/usb/peak_usb/pcan_usb_core.c b/drivers/net/can/usb/peak_usb/pcan_usb_core.c
index 1ca76e03e965..8f699ee6a528 100644
--- a/drivers/net/can/usb/peak_usb/pcan_usb_core.c
+++ b/drivers/net/can/usb/peak_usb/pcan_usb_core.c
@@ -80,21 +80,6 @@ void peak_usb_init_time_ref(struct peak_time_ref *time_ref,
 	}
 }
 
-static void peak_usb_add_us(struct timeval *tv, u32 delta_us)
-{
-	/* number of s. to add to final time */
-	u32 delta_s = delta_us / 1000000;
-
-	delta_us -= delta_s * 1000000;
-
-	tv->tv_usec += delta_us;
-	if (tv->tv_usec >= 1000000) {
-		tv->tv_usec -= 1000000;
-		delta_s++;
-	}
-	tv->tv_sec += delta_s;
-}
-
 /*
  * sometimes, another now may be  more recent than current one...
  */
@@ -103,7 +88,7 @@ void peak_usb_update_ts_now(struct peak_time_ref *time_ref, u32 ts_now)
 	time_ref->ts_dev_2 = ts_now;
 
 	/* should wait at least two passes before computing */
-	if (time_ref->tv_host.tv_sec > 0) {
+	if (ktime_to_ns(time_ref->tv_host) > 0) {
 		u32 delta_ts = time_ref->ts_dev_2 - time_ref->ts_dev_1;
 
 		if (time_ref->ts_dev_2 < time_ref->ts_dev_1)
@@ -118,26 +103,26 @@ void peak_usb_update_ts_now(struct peak_time_ref *time_ref, u32 ts_now)
  */
 void peak_usb_set_ts_now(struct peak_time_ref *time_ref, u32 ts_now)
 {
-	if (time_ref->tv_host_0.tv_sec == 0) {
+	if (ktime_to_ns(time_ref->tv_host_0) == 0) {
 		/* use monotonic clock to correctly compute further deltas */
-		time_ref->tv_host_0 = ktime_to_timeval(ktime_get());
-		time_ref->tv_host.tv_sec = 0;
+		time_ref->tv_host_0 = ktime_get();
+		time_ref->tv_host = ktime_set(0, 0);
 	} else {
 		/*
-		 * delta_us should not be >= 2^32 => delta_s should be < 4294
+		 * delta_us should not be >= 2^32 => delta should be < 4294s
 		 * handle 32-bits wrapping here: if count of s. reaches 4200,
 		 * reset counters and change time base
 		 */
-		if (time_ref->tv_host.tv_sec != 0) {
-			u32 delta_s = time_ref->tv_host.tv_sec
-						- time_ref->tv_host_0.tv_sec;
-			if (delta_s > 4200) {
+		if (ktime_to_ns(time_ref->tv_host)) {
+			ktime_t delta = ktime_sub(time_ref->tv_host,
+						  time_ref->tv_host_0);
+			if (ktime_to_ns(delta) > (4200ull * NSEC_PER_SEC)) {
 				time_ref->tv_host_0 = time_ref->tv_host;
 				time_ref->ts_total = 0;
 			}
 		}
 
-		time_ref->tv_host = ktime_to_timeval(ktime_get());
+		time_ref->tv_host = ktime_get();
 		time_ref->tick_count++;
 	}
 
@@ -146,13 +131,12 @@ void peak_usb_set_ts_now(struct peak_time_ref *time_ref, u32 ts_now)
 }
 
 /*
- * compute timeval according to current ts and time_ref data
+ * compute time according to current ts and time_ref data
  */
-void peak_usb_get_ts_tv(struct peak_time_ref *time_ref, u32 ts,
-			struct timeval *tv)
+void peak_usb_get_ts_time(struct peak_time_ref *time_ref, u32 ts, ktime_t *time)
 {
-	/* protect from getting timeval before setting now */
-	if (time_ref->tv_host.tv_sec > 0) {
+	/* protect from getting time before setting now */
+	if (ktime_to_ns(time_ref->tv_host)) {
 		u64 delta_us;
 
 		delta_us = ts - time_ref->ts_dev_2;
@@ -164,10 +148,9 @@ void peak_usb_get_ts_tv(struct peak_time_ref *time_ref, u32 ts,
 		delta_us *= time_ref->adapter->us_per_ts_scale;
 		delta_us >>= time_ref->adapter->us_per_ts_shift;
 
-		*tv = time_ref->tv_host_0;
-		peak_usb_add_us(tv, (u32)delta_us);
+		*time = ktime_add_us(time_ref->tv_host_0, delta_us);
 	} else {
-		*tv = ktime_to_timeval(ktime_get());
+		*time = ktime_get();
 	}
 }
 
@@ -178,10 +161,8 @@ int peak_usb_netif_rx(struct sk_buff *skb,
 		      struct peak_time_ref *time_ref, u32 ts_low, u32 ts_high)
 {
 	struct skb_shared_hwtstamps *hwts = skb_hwtstamps(skb);
-	struct timeval tv;
 
-	peak_usb_get_ts_tv(time_ref, ts_low, &tv);
-	hwts->hwtstamp = timeval_to_ktime(tv);
+	peak_usb_get_ts_time(time_ref, ts_low, &hwts->hwtstamp);
 
 	return netif_rx(skb);
 }
diff --git a/drivers/net/can/usb/peak_usb/pcan_usb_core.h b/drivers/net/can/usb/peak_usb/pcan_usb_core.h
index c01316cac354..29f03dccca10 100644
--- a/drivers/net/can/usb/peak_usb/pcan_usb_core.h
+++ b/drivers/net/can/usb/peak_usb/pcan_usb_core.h
@@ -96,7 +96,7 @@ extern const struct peak_usb_adapter pcan_usb_pro_fd;
 extern const struct peak_usb_adapter pcan_usb_x6;
 
 struct peak_time_ref {
-	struct timeval tv_host_0, tv_host;
+	ktime_t tv_host_0, tv_host;
 	u32 ts_dev_1, ts_dev_2;
 	u64 ts_total;
 	u32 tick_count;
@@ -151,8 +151,7 @@ void peak_usb_init_time_ref(struct peak_time_ref *time_ref,
 			    const struct peak_usb_adapter *adapter);
 void peak_usb_update_ts_now(struct peak_time_ref *time_ref, u32 ts_now);
 void peak_usb_set_ts_now(struct peak_time_ref *time_ref, u32 ts_now);
-void peak_usb_get_ts_tv(struct peak_time_ref *time_ref, u32 ts,
-			struct timeval *tv);
+void peak_usb_get_ts_time(struct peak_time_ref *time_ref, u32 ts, ktime_t *tv);
 int peak_usb_netif_rx(struct sk_buff *skb,
 		      struct peak_time_ref *time_ref, u32 ts_low, u32 ts_high);
 void peak_usb_async_complete(struct urb *urb);
diff --git a/drivers/net/can/usb/peak_usb/pcan_usb_pro.c b/drivers/net/can/usb/peak_usb/pcan_usb_pro.c
index bbdd6058cd2f..0105fbfea273 100644
--- a/drivers/net/can/usb/peak_usb/pcan_usb_pro.c
+++ b/drivers/net/can/usb/peak_usb/pcan_usb_pro.c
@@ -531,7 +531,6 @@ static int pcan_usb_pro_handle_canmsg(struct pcan_usb_pro_interface *usb_if,
 	struct net_device *netdev = dev->netdev;
 	struct can_frame *can_frame;
 	struct sk_buff *skb;
-	struct timeval tv;
 	struct skb_shared_hwtstamps *hwts;
 
 	skb = alloc_can_skb(netdev, &can_frame);
@@ -549,9 +548,9 @@ static int pcan_usb_pro_handle_canmsg(struct pcan_usb_pro_interface *usb_if,
 	else
 		memcpy(can_frame->data, rx->data, can_frame->can_dlc);
 
-	peak_usb_get_ts_tv(&usb_if->time_ref, le32_to_cpu(rx->ts32), &tv);
 	hwts = skb_hwtstamps(skb);
-	hwts->hwtstamp = timeval_to_ktime(tv);
+	peak_usb_get_ts_time(&usb_if->time_ref, le32_to_cpu(rx->ts32),
+			     &hwts->hwtstamp);
 
 	netdev->stats.rx_packets++;
 	netdev->stats.rx_bytes += can_frame->can_dlc;
@@ -571,7 +570,6 @@ static int pcan_usb_pro_handle_error(struct pcan_usb_pro_interface *usb_if,
 	enum can_state new_state = CAN_STATE_ERROR_ACTIVE;
 	u8 err_mask = 0;
 	struct sk_buff *skb;
-	struct timeval tv;
 	struct skb_shared_hwtstamps *hwts;
 
 	/* nothing should be sent while in BUS_OFF state */
@@ -667,9 +665,8 @@ static int pcan_usb_pro_handle_error(struct pcan_usb_pro_interface *usb_if,
 
 	dev->can.state = new_state;
 
-	peak_usb_get_ts_tv(&usb_if->time_ref, le32_to_cpu(er->ts32), &tv);
 	hwts = skb_hwtstamps(skb);
-	hwts->hwtstamp = timeval_to_ktime(tv);
+	peak_usb_get_ts_time(&usb_if->time_ref, le32_to_cpu(er->ts32), &hwts->hwtstamp);
 	netdev->stats.rx_packets++;
 	netdev->stats.rx_bytes += can_frame->can_dlc;
 	netif_rx(skb);
diff --git a/drivers/net/can/vxcan.c b/drivers/net/can/vxcan.c
index 8404e8852a0f..5d1753cfacea 100644
--- a/drivers/net/can/vxcan.c
+++ b/drivers/net/can/vxcan.c
@@ -227,10 +227,8 @@ static int vxcan_newlink(struct net *net, struct net_device *dev,
 	netif_carrier_off(peer);
 
 	err = rtnl_configure_link(peer, ifmp);
-	if (err < 0) {
-		unregister_netdevice(peer);
-		return err;
-	}
+	if (err < 0)
+		goto unregister_network_device;
 
 	/* register first device */
 	if (tb[IFLA_IFNAME])
@@ -239,10 +237,8 @@ static int vxcan_newlink(struct net *net, struct net_device *dev,
 		snprintf(dev->name, IFNAMSIZ, DRV_NAME "%%d");
 
 	err = register_netdevice(dev);
-	if (err < 0) {
-		unregister_netdevice(peer);
-		return err;
-	}
+	if (err < 0)
+		goto unregister_network_device;
 
 	netif_carrier_off(dev);
 
@@ -254,6 +250,10 @@ static int vxcan_newlink(struct net *net, struct net_device *dev,
 	rcu_assign_pointer(priv->peer, dev);
 
 	return 0;
+
+unregister_network_device:
+	unregister_netdevice(peer);
+	return err;
 }
 
 static void vxcan_dellink(struct net_device *dev, struct list_head *head)
diff --git a/drivers/net/dsa/Kconfig b/drivers/net/dsa/Kconfig
index 83a9bc892a3b..2b81b97e994f 100644
--- a/drivers/net/dsa/Kconfig
+++ b/drivers/net/dsa/Kconfig
@@ -33,7 +33,7 @@ config NET_DSA_MT7530
 
 config NET_DSA_MV88E6060
 	tristate "Marvell 88E6060 ethernet switch chip support"
-	depends on NET_DSA
+	depends on NET_DSA && NET_DSA_LEGACY
 	select NET_DSA_TAG_TRAILER
 	---help---
 	  This enables support for the Marvell 88E6060 ethernet switch
diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c
index f5a8dd96fd75..561b05089cb6 100644
--- a/drivers/net/dsa/b53/b53_common.c
+++ b/drivers/net/dsa/b53/b53_common.c
@@ -1029,8 +1029,7 @@ int b53_vlan_filtering(struct dsa_switch *ds, int port, bool vlan_filtering)
 EXPORT_SYMBOL(b53_vlan_filtering);
 
 int b53_vlan_prepare(struct dsa_switch *ds, int port,
-		     const struct switchdev_obj_port_vlan *vlan,
-		     struct switchdev_trans *trans)
+		     const struct switchdev_obj_port_vlan *vlan)
 {
 	struct b53_device *dev = ds->priv;
 
@@ -1047,8 +1046,7 @@ int b53_vlan_prepare(struct dsa_switch *ds, int port,
 EXPORT_SYMBOL(b53_vlan_prepare);
 
 void b53_vlan_add(struct dsa_switch *ds, int port,
-		  const struct switchdev_obj_port_vlan *vlan,
-		  struct switchdev_trans *trans)
+		  const struct switchdev_obj_port_vlan *vlan)
 {
 	struct b53_device *dev = ds->priv;
 	bool untagged = vlan->flags & BRIDGE_VLAN_INFO_UNTAGGED;
@@ -1495,8 +1493,7 @@ static bool b53_can_enable_brcm_tags(struct dsa_switch *ds, int port)
 	return false;
 }
 
-static enum dsa_tag_protocol b53_get_tag_protocol(struct dsa_switch *ds,
-						  int port)
+enum dsa_tag_protocol b53_get_tag_protocol(struct dsa_switch *ds, int port)
 {
 	struct b53_device *dev = ds->priv;
 
@@ -1514,6 +1511,7 @@ static enum dsa_tag_protocol b53_get_tag_protocol(struct dsa_switch *ds,
 
 	return DSA_TAG_PROTO_BRCM;
 }
+EXPORT_SYMBOL(b53_get_tag_protocol);
 
 int b53_mirror_add(struct dsa_switch *ds, int port,
 		   struct dsa_mall_mirror_tc_entry *mirror, bool ingress)
diff --git a/drivers/net/dsa/b53/b53_priv.h b/drivers/net/dsa/b53/b53_priv.h
index daaaa1ecb996..d954cf36ecd8 100644
--- a/drivers/net/dsa/b53/b53_priv.h
+++ b/drivers/net/dsa/b53/b53_priv.h
@@ -295,11 +295,9 @@ void b53_br_set_stp_state(struct dsa_switch *ds, int port, u8 state);
 void b53_br_fast_age(struct dsa_switch *ds, int port);
 int b53_vlan_filtering(struct dsa_switch *ds, int port, bool vlan_filtering);
 int b53_vlan_prepare(struct dsa_switch *ds, int port,
-		     const struct switchdev_obj_port_vlan *vlan,
-		     struct switchdev_trans *trans);
+		     const struct switchdev_obj_port_vlan *vlan);
 void b53_vlan_add(struct dsa_switch *ds, int port,
-		  const struct switchdev_obj_port_vlan *vlan,
-		  struct switchdev_trans *trans);
+		  const struct switchdev_obj_port_vlan *vlan);
 int b53_vlan_del(struct dsa_switch *ds, int port,
 		 const struct switchdev_obj_port_vlan *vlan);
 int b53_fdb_add(struct dsa_switch *ds, int port,
@@ -310,6 +308,7 @@ int b53_fdb_dump(struct dsa_switch *ds, int port,
 		 dsa_fdb_dump_cb_t *cb, void *data);
 int b53_mirror_add(struct dsa_switch *ds, int port,
 		   struct dsa_mall_mirror_tc_entry *mirror, bool ingress);
+enum dsa_tag_protocol b53_get_tag_protocol(struct dsa_switch *ds, int port);
 void b53_mirror_del(struct dsa_switch *ds, int port,
 		    struct dsa_mall_mirror_tc_entry *mirror);
 int b53_enable_port(struct dsa_switch *ds, int port, struct phy_device *phy);
diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
index b62d47210db8..8ea0cd494ea0 100644
--- a/drivers/net/dsa/bcm_sf2.c
+++ b/drivers/net/dsa/bcm_sf2.c
@@ -34,12 +34,6 @@
 #include "b53/b53_priv.h"
 #include "b53/b53_regs.h"
 
-static enum dsa_tag_protocol bcm_sf2_sw_get_tag_protocol(struct dsa_switch *ds,
-							 int port)
-{
-	return DSA_TAG_PROTO_BRCM;
-}
-
 static void bcm_sf2_imp_setup(struct dsa_switch *ds, int port)
 {
 	struct bcm_sf2_priv *priv = bcm_sf2_to_priv(ds);
@@ -860,7 +854,7 @@ static const struct b53_io_ops bcm_sf2_io_ops = {
 };
 
 static const struct dsa_switch_ops bcm_sf2_ops = {
-	.get_tag_protocol	= bcm_sf2_sw_get_tag_protocol,
+	.get_tag_protocol	= b53_get_tag_protocol,
 	.setup			= bcm_sf2_sw_setup,
 	.get_strings		= b53_get_strings,
 	.get_ethtool_stats	= b53_get_ethtool_stats,
diff --git a/drivers/net/dsa/dsa_loop.c b/drivers/net/dsa/dsa_loop.c
index bb71d3d6f65b..7aa84ee4e771 100644
--- a/drivers/net/dsa/dsa_loop.c
+++ b/drivers/net/dsa/dsa_loop.c
@@ -174,9 +174,9 @@ static int dsa_loop_port_vlan_filtering(struct dsa_switch *ds, int port,
 	return 0;
 }
 
-static int dsa_loop_port_vlan_prepare(struct dsa_switch *ds, int port,
-				      const struct switchdev_obj_port_vlan *vlan,
-				      struct switchdev_trans *trans)
+static int
+dsa_loop_port_vlan_prepare(struct dsa_switch *ds, int port,
+			   const struct switchdev_obj_port_vlan *vlan)
 {
 	struct dsa_loop_priv *ps = ds->priv;
 	struct mii_bus *bus = ps->bus;
@@ -193,8 +193,7 @@ static int dsa_loop_port_vlan_prepare(struct dsa_switch *ds, int port,
 }
 
 static void dsa_loop_port_vlan_add(struct dsa_switch *ds, int port,
-				   const struct switchdev_obj_port_vlan *vlan,
-				   struct switchdev_trans *trans)
+				   const struct switchdev_obj_port_vlan *vlan)
 {
 	bool untagged = vlan->flags & BRIDGE_VLAN_INFO_UNTAGGED;
 	bool pvid = vlan->flags & BRIDGE_VLAN_INFO_PVID;
diff --git a/drivers/net/dsa/lan9303-core.c b/drivers/net/dsa/lan9303-core.c
index b24566bb74d2..c1b004fa64d9 100644
--- a/drivers/net/dsa/lan9303-core.c
+++ b/drivers/net/dsa/lan9303-core.c
@@ -583,6 +583,7 @@ static void lan9303_alr_loop(struct lan9303 *chip, alr_loop_cb_t *cb, void *ctx)
 {
 	int i;
 
+	mutex_lock(&chip->alr_mutex);
 	lan9303_write_switch_reg(chip, LAN9303_SWE_ALR_CMD,
 				 LAN9303_ALR_CMD_GET_FIRST);
 	lan9303_write_switch_reg(chip, LAN9303_SWE_ALR_CMD, 0);
@@ -606,6 +607,7 @@ static void lan9303_alr_loop(struct lan9303 *chip, alr_loop_cb_t *cb, void *ctx)
 					 LAN9303_ALR_CMD_GET_NEXT);
 		lan9303_write_switch_reg(chip, LAN9303_SWE_ALR_CMD, 0);
 	}
+	mutex_unlock(&chip->alr_mutex);
 }
 
 static void alr_reg_to_mac(u32 dat0, u32 dat1, u8 mac[6])
@@ -694,16 +696,20 @@ static int lan9303_alr_add_port(struct lan9303 *chip, const u8 *mac, int port,
 {
 	struct lan9303_alr_cache_entry *entr;
 
+	mutex_lock(&chip->alr_mutex);
 	entr = lan9303_alr_cache_find_mac(chip, mac);
 	if (!entr) { /*New entry */
 		entr = lan9303_alr_cache_find_free(chip);
-		if (!entr)
+		if (!entr) {
+			mutex_unlock(&chip->alr_mutex);
 			return -ENOSPC;
+		}
 		ether_addr_copy(entr->mac_addr, mac);
 	}
 	entr->port_map |= BIT(port);
 	entr->stp_override = stp_override;
 	lan9303_alr_set_entry(chip, mac, entr->port_map, stp_override);
+	mutex_unlock(&chip->alr_mutex);
 
 	return 0;
 }
@@ -713,15 +719,18 @@ static int lan9303_alr_del_port(struct lan9303 *chip, const u8 *mac, int port)
 {
 	struct lan9303_alr_cache_entry *entr;
 
+	mutex_lock(&chip->alr_mutex);
 	entr = lan9303_alr_cache_find_mac(chip, mac);
 	if (!entr)
-		return 0;  /* no static entry found */
+		goto out;  /* no static entry found */
 
 	entr->port_map &= ~BIT(port);
 	if (entr->port_map == 0) /* zero means its free again */
 		eth_zero_addr(entr->mac_addr);
 	lan9303_alr_set_entry(chip, mac, entr->port_map, entr->stp_override);
 
+out:
+	mutex_unlock(&chip->alr_mutex);
 	return 0;
 }
 
@@ -1217,8 +1226,7 @@ static int lan9303_port_fdb_dump(struct dsa_switch *ds, int port,
 }
 
 static int lan9303_port_mdb_prepare(struct dsa_switch *ds, int port,
-				    const struct switchdev_obj_port_mdb *mdb,
-				    struct switchdev_trans *trans)
+				    const struct switchdev_obj_port_mdb *mdb)
 {
 	struct lan9303 *chip = ds->priv;
 
@@ -1235,8 +1243,7 @@ static int lan9303_port_mdb_prepare(struct dsa_switch *ds, int port,
 }
 
 static void lan9303_port_mdb_add(struct dsa_switch *ds, int port,
-				 const struct switchdev_obj_port_mdb *mdb,
-				 struct switchdev_trans *trans)
+				 const struct switchdev_obj_port_mdb *mdb)
 {
 	struct lan9303 *chip = ds->priv;
 
@@ -1325,6 +1332,7 @@ int lan9303_probe(struct lan9303 *chip, struct device_node *np)
 	int ret;
 
 	mutex_init(&chip->indirect_mutex);
+	mutex_init(&chip->alr_mutex);
 
 	lan9303_probe_reset_gpio(chip, np);
 
diff --git a/drivers/net/dsa/microchip/ksz_common.c b/drivers/net/dsa/microchip/ksz_common.c
index b5be93a1e0df..663b0d5b982b 100644
--- a/drivers/net/dsa/microchip/ksz_common.c
+++ b/drivers/net/dsa/microchip/ksz_common.c
@@ -559,8 +559,7 @@ static int ksz_port_vlan_filtering(struct dsa_switch *ds, int port, bool flag)
 }
 
 static int ksz_port_vlan_prepare(struct dsa_switch *ds, int port,
-				 const struct switchdev_obj_port_vlan *vlan,
-				 struct switchdev_trans *trans)
+				 const struct switchdev_obj_port_vlan *vlan)
 {
 	/* nothing needed */
 
@@ -568,8 +567,7 @@ static int ksz_port_vlan_prepare(struct dsa_switch *ds, int port,
 }
 
 static void ksz_port_vlan_add(struct dsa_switch *ds, int port,
-			      const struct switchdev_obj_port_vlan *vlan,
-			      struct switchdev_trans *trans)
+			      const struct switchdev_obj_port_vlan *vlan)
 {
 	struct ksz_device *dev = ds->priv;
 	u32 vlan_table[3];
@@ -858,16 +856,14 @@ exit:
 }
 
 static int ksz_port_mdb_prepare(struct dsa_switch *ds, int port,
-				const struct switchdev_obj_port_mdb *mdb,
-				struct switchdev_trans *trans)
+				const struct switchdev_obj_port_mdb *mdb)
 {
 	/* nothing to do */
 	return 0;
 }
 
 static void ksz_port_mdb_add(struct dsa_switch *ds, int port,
-			     const struct switchdev_obj_port_mdb *mdb,
-			     struct switchdev_trans *trans)
+			     const struct switchdev_obj_port_mdb *mdb)
 {
 	struct ksz_device *dev = ds->priv;
 	u32 static_table[4];
diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 66d33e97cbc5..fc512c98f2f8 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -1185,8 +1185,7 @@ static int mv88e6xxx_port_vlan_filtering(struct dsa_switch *ds, int port,
 
 static int
 mv88e6xxx_port_vlan_prepare(struct dsa_switch *ds, int port,
-			    const struct switchdev_obj_port_vlan *vlan,
-			    struct switchdev_trans *trans)
+			    const struct switchdev_obj_port_vlan *vlan)
 {
 	struct mv88e6xxx_chip *chip = ds->priv;
 	int err;
@@ -1295,8 +1294,7 @@ static int _mv88e6xxx_port_vlan_add(struct mv88e6xxx_chip *chip, int port,
 }
 
 static void mv88e6xxx_port_vlan_add(struct dsa_switch *ds, int port,
-				    const struct switchdev_obj_port_vlan *vlan,
-				    struct switchdev_trans *trans)
+				    const struct switchdev_obj_port_vlan *vlan)
 {
 	struct mv88e6xxx_chip *chip = ds->priv;
 	bool untagged = vlan->flags & BRIDGE_VLAN_INFO_UNTAGGED;
@@ -1725,9 +1723,11 @@ static int mv88e6xxx_setup_message_port(struct mv88e6xxx_chip *chip, int port)
 
 static int mv88e6xxx_setup_egress_floods(struct mv88e6xxx_chip *chip, int port)
 {
-	bool flood = port == dsa_upstream_port(chip->ds);
+	struct dsa_switch *ds = chip->ds;
+	bool flood;
 
 	/* Upstream ports flood frames with unknown unicast or multicast DA */
+	flood = dsa_is_cpu_port(ds, port) || dsa_is_dsa_port(ds, port);
 	if (chip->info->ops->port_set_egress_floods)
 		return chip->info->ops->port_set_egress_floods(chip, port,
 							       flood, flood);
@@ -1744,6 +1744,39 @@ static int mv88e6xxx_serdes_power(struct mv88e6xxx_chip *chip, int port,
 	return 0;
 }
 
+static int mv88e6xxx_setup_upstream_port(struct mv88e6xxx_chip *chip, int port)
+{
+	struct dsa_switch *ds = chip->ds;
+	int upstream_port;
+	int err;
+
+	upstream_port = dsa_upstream_port(ds, port);
+	if (chip->info->ops->port_set_upstream_port) {
+		err = chip->info->ops->port_set_upstream_port(chip, port,
+							      upstream_port);
+		if (err)
+			return err;
+	}
+
+	if (port == upstream_port) {
+		if (chip->info->ops->set_cpu_port) {
+			err = chip->info->ops->set_cpu_port(chip,
+							    upstream_port);
+			if (err)
+				return err;
+		}
+
+		if (chip->info->ops->set_egress_port) {
+			err = chip->info->ops->set_egress_port(chip,
+							       upstream_port);
+			if (err)
+				return err;
+		}
+	}
+
+	return 0;
+}
+
 static int mv88e6xxx_setup_port(struct mv88e6xxx_chip *chip, int port)
 {
 	struct dsa_switch *ds = chip->ds;
@@ -1814,13 +1847,9 @@ static int mv88e6xxx_setup_port(struct mv88e6xxx_chip *chip, int port)
 	if (err)
 		return err;
 
-	reg = 0;
-	if (chip->info->ops->port_set_upstream_port) {
-		err = chip->info->ops->port_set_upstream_port(
-			chip, port, dsa_upstream_port(ds));
-		if (err)
-			return err;
-	}
+	err = mv88e6xxx_setup_upstream_port(chip, port);
+	if (err)
+		return err;
 
 	err = mv88e6xxx_port_set_8021q_mode(chip, port,
 				MV88E6XXX_PORT_CTL2_8021Q_MODE_DISABLED);
@@ -1946,21 +1975,8 @@ static int mv88e6xxx_set_ageing_time(struct dsa_switch *ds,
 static int mv88e6xxx_g1_setup(struct mv88e6xxx_chip *chip)
 {
 	struct dsa_switch *ds = chip->ds;
-	u32 upstream_port = dsa_upstream_port(ds);
 	int err;
 
-	if (chip->info->ops->set_cpu_port) {
-		err = chip->info->ops->set_cpu_port(chip, upstream_port);
-		if (err)
-			return err;
-	}
-
-	if (chip->info->ops->set_egress_port) {
-		err = chip->info->ops->set_egress_port(chip, upstream_port);
-		if (err)
-			return err;
-	}
-
 	/* Disable remote management, and set the switch's DSA device number. */
 	err = mv88e6xxx_g1_write(chip, MV88E6XXX_G1_CTL2,
 				 MV88E6XXX_G1_CTL2_MULTIPLE_CASCADE |
@@ -3741,6 +3757,7 @@ static enum dsa_tag_protocol mv88e6xxx_get_tag_protocol(struct dsa_switch *ds,
 	return chip->info->tag_protocol;
 }
 
+#if IS_ENABLED(CONFIG_NET_DSA_LEGACY)
 static const char *mv88e6xxx_drv_probe(struct device *dsa_dev,
 				       struct device *host_dev, int sw_addr,
 				       void **priv)
@@ -3788,10 +3805,10 @@ free:
 
 	return NULL;
 }
+#endif
 
 static int mv88e6xxx_port_mdb_prepare(struct dsa_switch *ds, int port,
-				      const struct switchdev_obj_port_mdb *mdb,
-				      struct switchdev_trans *trans)
+				      const struct switchdev_obj_port_mdb *mdb)
 {
 	/* We don't need any dynamic resource from the kernel (yet),
 	 * so skip the prepare phase.
@@ -3801,8 +3818,7 @@ static int mv88e6xxx_port_mdb_prepare(struct dsa_switch *ds, int port,
 }
 
 static void mv88e6xxx_port_mdb_add(struct dsa_switch *ds, int port,
-				   const struct switchdev_obj_port_mdb *mdb,
-				   struct switchdev_trans *trans)
+				   const struct switchdev_obj_port_mdb *mdb)
 {
 	struct mv88e6xxx_chip *chip = ds->priv;
 
@@ -3829,7 +3845,9 @@ static int mv88e6xxx_port_mdb_del(struct dsa_switch *ds, int port,
 }
 
 static const struct dsa_switch_ops mv88e6xxx_switch_ops = {
+#if IS_ENABLED(CONFIG_NET_DSA_LEGACY)
 	.probe			= mv88e6xxx_drv_probe,
+#endif
 	.get_tag_protocol	= mv88e6xxx_get_tag_protocol,
 	.setup			= mv88e6xxx_setup,
 	.adjust_link		= mv88e6xxx_adjust_link,
diff --git a/drivers/net/dummy.c b/drivers/net/dummy.c
index 58483af80bdb..30b1c8512049 100644
--- a/drivers/net/dummy.c
+++ b/drivers/net/dummy.c
@@ -42,48 +42,7 @@
 #define DRV_NAME	"dummy"
 #define DRV_VERSION	"1.0"
 
-#undef pr_fmt
-#define pr_fmt(fmt) DRV_NAME ": " fmt
-
 static int numdummies = 1;
-static int num_vfs;
-
-struct vf_data_storage {
-	u8	vf_mac[ETH_ALEN];
-	u16	pf_vlan; /* When set, guest VLAN config not allowed. */
-	u16	pf_qos;
-	__be16	vlan_proto;
-	u16	min_tx_rate;
-	u16	max_tx_rate;
-	u8	spoofchk_enabled;
-	bool	rss_query_enabled;
-	u8	trusted;
-	int	link_state;
-};
-
-struct dummy_priv {
-	struct vf_data_storage	*vfinfo;
-};
-
-static int dummy_num_vf(struct device *dev)
-{
-	return num_vfs;
-}
-
-static struct bus_type dummy_bus = {
-	.name	= "dummy",
-	.num_vf	= dummy_num_vf,
-};
-
-static void release_dummy_parent(struct device *dev)
-{
-}
-
-static struct device dummy_parent = {
-	.init_name	= "dummy",
-	.bus		= &dummy_bus,
-	.release	= release_dummy_parent,
-};
 
 /* fake multicast ability */
 static void set_multicast_list(struct net_device *dev)
@@ -133,25 +92,10 @@ static netdev_tx_t dummy_xmit(struct sk_buff *skb, struct net_device *dev)
 
 static int dummy_dev_init(struct net_device *dev)
 {
-	struct dummy_priv *priv = netdev_priv(dev);
-
 	dev->dstats = netdev_alloc_pcpu_stats(struct pcpu_dstats);
 	if (!dev->dstats)
 		return -ENOMEM;
 
-	priv->vfinfo = NULL;
-
-	if (!num_vfs)
-		return 0;
-
-	dev->dev.parent = &dummy_parent;
-	priv->vfinfo = kcalloc(num_vfs, sizeof(struct vf_data_storage),
-			       GFP_KERNEL);
-	if (!priv->vfinfo) {
-		free_percpu(dev->dstats);
-		return -ENOMEM;
-	}
-
 	return 0;
 }
 
@@ -169,117 +113,6 @@ static int dummy_change_carrier(struct net_device *dev, bool new_carrier)
 	return 0;
 }
 
-static int dummy_set_vf_mac(struct net_device *dev, int vf, u8 *mac)
-{
-	struct dummy_priv *priv = netdev_priv(dev);
-
-	if (!is_valid_ether_addr(mac) || (vf >= num_vfs))
-		return -EINVAL;
-
-	memcpy(priv->vfinfo[vf].vf_mac, mac, ETH_ALEN);
-
-	return 0;
-}
-
-static int dummy_set_vf_vlan(struct net_device *dev, int vf,
-			     u16 vlan, u8 qos, __be16 vlan_proto)
-{
-	struct dummy_priv *priv = netdev_priv(dev);
-
-	if ((vf >= num_vfs) || (vlan > 4095) || (qos > 7))
-		return -EINVAL;
-
-	priv->vfinfo[vf].pf_vlan = vlan;
-	priv->vfinfo[vf].pf_qos = qos;
-	priv->vfinfo[vf].vlan_proto = vlan_proto;
-
-	return 0;
-}
-
-static int dummy_set_vf_rate(struct net_device *dev, int vf, int min, int max)
-{
-	struct dummy_priv *priv = netdev_priv(dev);
-
-	if (vf >= num_vfs)
-		return -EINVAL;
-
-	priv->vfinfo[vf].min_tx_rate = min;
-	priv->vfinfo[vf].max_tx_rate = max;
-
-	return 0;
-}
-
-static int dummy_set_vf_spoofchk(struct net_device *dev, int vf, bool val)
-{
-	struct dummy_priv *priv = netdev_priv(dev);
-
-	if (vf >= num_vfs)
-		return -EINVAL;
-
-	priv->vfinfo[vf].spoofchk_enabled = val;
-
-	return 0;
-}
-
-static int dummy_set_vf_rss_query_en(struct net_device *dev, int vf, bool val)
-{
-	struct dummy_priv *priv = netdev_priv(dev);
-
-	if (vf >= num_vfs)
-		return -EINVAL;
-
-	priv->vfinfo[vf].rss_query_enabled = val;
-
-	return 0;
-}
-
-static int dummy_set_vf_trust(struct net_device *dev, int vf, bool val)
-{
-	struct dummy_priv *priv = netdev_priv(dev);
-
-	if (vf >= num_vfs)
-		return -EINVAL;
-
-	priv->vfinfo[vf].trusted = val;
-
-	return 0;
-}
-
-static int dummy_get_vf_config(struct net_device *dev,
-			       int vf, struct ifla_vf_info *ivi)
-{
-	struct dummy_priv *priv = netdev_priv(dev);
-
-	if (vf >= num_vfs)
-		return -EINVAL;
-
-	ivi->vf = vf;
-	memcpy(&ivi->mac, priv->vfinfo[vf].vf_mac, ETH_ALEN);
-	ivi->vlan = priv->vfinfo[vf].pf_vlan;
-	ivi->qos = priv->vfinfo[vf].pf_qos;
-	ivi->spoofchk = priv->vfinfo[vf].spoofchk_enabled;
-	ivi->linkstate = priv->vfinfo[vf].link_state;
-	ivi->min_tx_rate = priv->vfinfo[vf].min_tx_rate;
-	ivi->max_tx_rate = priv->vfinfo[vf].max_tx_rate;
-	ivi->rss_query_en = priv->vfinfo[vf].rss_query_enabled;
-	ivi->trusted = priv->vfinfo[vf].trusted;
-	ivi->vlan_proto = priv->vfinfo[vf].vlan_proto;
-
-	return 0;
-}
-
-static int dummy_set_vf_link_state(struct net_device *dev, int vf, int state)
-{
-	struct dummy_priv *priv = netdev_priv(dev);
-
-	if (vf >= num_vfs)
-		return -EINVAL;
-
-	priv->vfinfo[vf].link_state = state;
-
-	return 0;
-}
-
 static const struct net_device_ops dummy_netdev_ops = {
 	.ndo_init		= dummy_dev_init,
 	.ndo_uninit		= dummy_dev_uninit,
@@ -289,14 +122,6 @@ static const struct net_device_ops dummy_netdev_ops = {
 	.ndo_set_mac_address	= eth_mac_addr,
 	.ndo_get_stats64	= dummy_get_stats64,
 	.ndo_change_carrier	= dummy_change_carrier,
-	.ndo_set_vf_mac		= dummy_set_vf_mac,
-	.ndo_set_vf_vlan	= dummy_set_vf_vlan,
-	.ndo_set_vf_rate	= dummy_set_vf_rate,
-	.ndo_set_vf_spoofchk	= dummy_set_vf_spoofchk,
-	.ndo_set_vf_trust	= dummy_set_vf_trust,
-	.ndo_get_vf_config	= dummy_get_vf_config,
-	.ndo_set_vf_link_state	= dummy_set_vf_link_state,
-	.ndo_set_vf_rss_query_en = dummy_set_vf_rss_query_en,
 };
 
 static void dummy_get_drvinfo(struct net_device *dev,
@@ -323,13 +148,6 @@ static const struct ethtool_ops dummy_ethtool_ops = {
 	.get_ts_info		= dummy_get_ts_info,
 };
 
-static void dummy_free_netdev(struct net_device *dev)
-{
-	struct dummy_priv *priv = netdev_priv(dev);
-
-	kfree(priv->vfinfo);
-}
-
 static void dummy_setup(struct net_device *dev)
 {
 	ether_setup(dev);
@@ -338,7 +156,6 @@ static void dummy_setup(struct net_device *dev)
 	dev->netdev_ops = &dummy_netdev_ops;
 	dev->ethtool_ops = &dummy_ethtool_ops;
 	dev->needs_free_netdev = true;
-	dev->priv_destructor = dummy_free_netdev;
 
 	/* Fill in device structure with ethernet-generic values. */
 	dev->flags |= IFF_NOARP;
@@ -370,7 +187,6 @@ static int dummy_validate(struct nlattr *tb[], struct nlattr *data[],
 
 static struct rtnl_link_ops dummy_link_ops __read_mostly = {
 	.kind		= DRV_NAME,
-	.priv_size	= sizeof(struct dummy_priv),
 	.setup		= dummy_setup,
 	.validate	= dummy_validate,
 };
@@ -379,16 +195,12 @@ static struct rtnl_link_ops dummy_link_ops __read_mostly = {
 module_param(numdummies, int, 0);
 MODULE_PARM_DESC(numdummies, "Number of dummy pseudo devices");
 
-module_param(num_vfs, int, 0);
-MODULE_PARM_DESC(num_vfs, "Number of dummy VFs per dummy device");
-
 static int __init dummy_init_one(void)
 {
 	struct net_device *dev_dummy;
 	int err;
 
-	dev_dummy = alloc_netdev(sizeof(struct dummy_priv),
-				 "dummy%d", NET_NAME_ENUM, dummy_setup);
+	dev_dummy = alloc_netdev(0, "dummy%d", NET_NAME_ENUM, dummy_setup);
 	if (!dev_dummy)
 		return -ENOMEM;
 
@@ -407,21 +219,6 @@ static int __init dummy_init_module(void)
 {
 	int i, err = 0;
 
-	if (num_vfs) {
-		err = bus_register(&dummy_bus);
-		if (err < 0) {
-			pr_err("registering dummy bus failed\n");
-			return err;
-		}
-
-		err = device_register(&dummy_parent);
-		if (err < 0) {
-			pr_err("registering dummy parent device failed\n");
-			bus_unregister(&dummy_bus);
-			return err;
-		}
-	}
-
 	rtnl_lock();
 	err = __rtnl_link_register(&dummy_link_ops);
 	if (err < 0)
@@ -437,22 +234,12 @@ static int __init dummy_init_module(void)
 out:
 	rtnl_unlock();
 
-	if (err && num_vfs) {
-		device_unregister(&dummy_parent);
-		bus_unregister(&dummy_bus);
-	}
-
 	return err;
 }
 
 static void __exit dummy_cleanup_module(void)
 {
 	rtnl_link_unregister(&dummy_link_ops);
-
-	if (num_vfs) {
-		device_unregister(&dummy_parent);
-		bus_unregister(&dummy_bus);
-	}
 }
 
 module_init(dummy_init_module);
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 61ca4eb7c6fa..1d865ae201db 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -1706,12 +1706,16 @@ static int bnxt_async_event_process(struct bnxt *bp,
 
 		if (BNXT_VF(bp))
 			goto async_event_process_exit;
-		if (data1 & 0x20000) {
+
+		/* print unsupported speed warning in forced speed mode only */
+		if (!(link_info->autoneg & BNXT_AUTONEG_SPEED) &&
+		    (data1 & 0x20000)) {
 			u16 fw_speed = link_info->force_link_speed;
 			u32 speed = bnxt_fw_to_ethtool_speed(fw_speed);
 
-			netdev_warn(bp->dev, "Link speed %d no longer supported\n",
-				    speed);
+			if (speed != SPEED_UNKNOWN)
+				netdev_warn(bp->dev, "Link speed %d no longer supported\n",
+					    speed);
 		}
 		set_bit(BNXT_LINK_SPEED_CHNG_SP_EVENT, &bp->sp_event);
 		/* fall thru */
@@ -7800,8 +7804,6 @@ static void bnxt_remove_one(struct pci_dev *pdev)
 	bnxt_dcb_free(bp);
 	kfree(bp->edev);
 	bp->edev = NULL;
-	if (bp->xdp_prog)
-		bpf_prog_put(bp->xdp_prog);
 	bnxt_cleanup_pci(bp);
 	free_netdev(dev);
 }
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
index b13ce5ebde8d..fe7599f404bf 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
@@ -1376,6 +1376,9 @@ static int bnxt_firmware_reset(struct net_device *dev,
 		req.embedded_proc_type = FW_RESET_REQ_EMBEDDED_PROC_TYPE_CHIP;
 		req.selfrst_status = FW_RESET_REQ_SELFRST_STATUS_SELFRSTASAP;
 		break;
+	case BNXT_FW_RESET_AP:
+		req.embedded_proc_type = FW_RESET_REQ_EMBEDDED_PROC_TYPE_AP;
+		break;
 	default:
 		return -EINVAL;
 	}
@@ -2522,6 +2525,14 @@ static int bnxt_reset(struct net_device *dev, u32 *flags)
 		rc = bnxt_firmware_reset(dev, BNXT_FW_RESET_CHIP);
 		if (!rc)
 			netdev_info(dev, "Reset request successful. Reload driver to complete reset\n");
+	} else if (*flags == ETH_RESET_AP) {
+		/* This feature is not supported in older firmware versions */
+		if (bp->hwrm_spec_code < 0x10803)
+			return -EOPNOTSUPP;
+
+		rc = bnxt_firmware_reset(dev, BNXT_FW_RESET_AP);
+		if (!rc)
+			netdev_info(dev, "Reset Application Processor request successful.\n");
 	} else {
 		rc = -EINVAL;
 	}
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.h b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.h
index ff601b42fcc8..836ef682f24c 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.h
@@ -34,6 +34,7 @@ struct bnxt_led_cfg {
 #define BNXT_LED_DFLT_ENABLES(x)			\
 	cpu_to_le32(BNXT_LED_DFLT_ENA << (BNXT_LED_DFLT_ENA_SHIFT * (x)))
 
+#define BNXT_FW_RESET_AP	0xfffe
 #define BNXT_FW_RESET_CHIP	0xffff
 
 extern const struct ethtool_ops bnxt_ethtool_ops;
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
index 3d201d7324bd..9807214da206 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
@@ -54,12 +54,10 @@ static int bnxt_tc_parse_redir(struct bnxt *bp,
 			       struct bnxt_tc_actions *actions,
 			       const struct tc_action *tc_act)
 {
-	int ifindex = tcf_mirred_ifindex(tc_act);
-	struct net_device *dev;
+	struct net_device *dev = tcf_mirred_dev(tc_act);
 
-	dev = __dev_get_by_index(dev_net(bp->dev), ifindex);
 	if (!dev) {
-		netdev_info(bp->dev, "no dev for ifindex=%d", ifindex);
+		netdev_info(bp->dev, "no dev in mirred action");
 		return -EINVAL;
 	}
 
@@ -148,9 +146,6 @@ static int bnxt_tc_parse_actions(struct bnxt *bp,
 		}
 	}
 
-	if (rc)
-		return rc;
-
 	if (actions->flags & BNXT_TC_ACTION_FLAG_FWD) {
 		if (actions->flags & BNXT_TC_ACTION_FLAG_TUNNEL_ENCAP) {
 			/* dst_fid is PF's fid */
@@ -164,7 +159,7 @@ static int bnxt_tc_parse_actions(struct bnxt *bp,
 		}
 	}
 
-	return rc;
+	return 0;
 }
 
 #define GET_KEY(flow_cmd, key_type)					\
diff --git a/drivers/net/ethernet/cadence/macb.h b/drivers/net/ethernet/cadence/macb.h
index c93f3a2dc6c1..c50c5ec49b1d 100644
--- a/drivers/net/ethernet/cadence/macb.h
+++ b/drivers/net/ethernet/cadence/macb.h
@@ -164,14 +164,38 @@
 #define GEM_DCFG5		0x0290 /* Design Config 5 */
 #define GEM_DCFG6		0x0294 /* Design Config 6 */
 #define GEM_DCFG7		0x0298 /* Design Config 7 */
+#define GEM_DCFG8		0x029C /* Design Config 8 */
 
 #define GEM_TXBDCTRL	0x04cc /* TX Buffer Descriptor control register */
 #define GEM_RXBDCTRL	0x04d0 /* RX Buffer Descriptor control register */
 
+/* Screener Type 2 match registers */
+#define GEM_SCRT2		0x540
+
+/* EtherType registers */
+#define GEM_ETHT		0x06E0
+
+/* Type 2 compare registers */
+#define GEM_T2CMPW0		0x0700
+#define GEM_T2CMPW1		0x0704
+#define T2CMP_OFST(t2idx)	(t2idx * 2)
+
+/* type 2 compare registers
+ * each location requires 3 compare regs
+ */
+#define GEM_IP4SRC_CMP(idx)		(idx * 3)
+#define GEM_IP4DST_CMP(idx)		(idx * 3 + 1)
+#define GEM_PORT_CMP(idx)		(idx * 3 + 2)
+
+/* Which screening type 2 EtherType register will be used (0 - 7) */
+#define SCRT2_ETHT		0
+
 #define GEM_ISR(hw_q)		(0x0400 + ((hw_q) << 2))
 #define GEM_TBQP(hw_q)		(0x0440 + ((hw_q) << 2))
 #define GEM_TBQPH(hw_q)		(0x04C8)
 #define GEM_RBQP(hw_q)		(0x0480 + ((hw_q) << 2))
+#define GEM_RBQS(hw_q)		(0x04A0 + ((hw_q) << 2))
+#define GEM_RBQPH(hw_q)		(0x04D4)
 #define GEM_IER(hw_q)		(0x0600 + ((hw_q) << 2))
 #define GEM_IDR(hw_q)		(0x0620 + ((hw_q) << 2))
 #define GEM_IMR(hw_q)		(0x0640 + ((hw_q) << 2))
@@ -455,6 +479,16 @@
 #define GEM_DAW64_OFFSET			23
 #define GEM_DAW64_SIZE				1
 
+/* Bitfields in DCFG8. */
+#define GEM_T1SCR_OFFSET			24
+#define GEM_T1SCR_SIZE				8
+#define GEM_T2SCR_OFFSET			16
+#define GEM_T2SCR_SIZE				8
+#define GEM_SCR2ETH_OFFSET			8
+#define GEM_SCR2ETH_SIZE			8
+#define GEM_SCR2CMP_OFFSET			0
+#define GEM_SCR2CMP_SIZE			8
+
 /* Bitfields in TISUBN */
 #define GEM_SUBNSINCR_OFFSET			0
 #define GEM_SUBNSINCR_SIZE			16
@@ -483,6 +517,66 @@
 #define GEM_RXTSMODE_OFFSET			4 /* RX Descriptor Timestamp Insertion mode */
 #define GEM_RXTSMODE_SIZE			2
 
+/* Bitfields in SCRT2 */
+#define GEM_QUEUE_OFFSET			0 /* Queue Number */
+#define GEM_QUEUE_SIZE				4
+#define GEM_VLANPR_OFFSET			4 /* VLAN Priority */
+#define GEM_VLANPR_SIZE				3
+#define GEM_VLANEN_OFFSET			8 /* VLAN Enable */
+#define GEM_VLANEN_SIZE				1
+#define GEM_ETHT2IDX_OFFSET			9 /* Index to screener type 2 EtherType register */
+#define GEM_ETHT2IDX_SIZE			3
+#define GEM_ETHTEN_OFFSET			12 /* EtherType Enable */
+#define GEM_ETHTEN_SIZE				1
+#define GEM_CMPA_OFFSET				13 /* Compare A - Index to screener type 2 Compare register */
+#define GEM_CMPA_SIZE				5
+#define GEM_CMPAEN_OFFSET			18 /* Compare A Enable */
+#define GEM_CMPAEN_SIZE				1
+#define GEM_CMPB_OFFSET				19 /* Compare B - Index to screener type 2 Compare register */
+#define GEM_CMPB_SIZE				5
+#define GEM_CMPBEN_OFFSET			24 /* Compare B Enable */
+#define GEM_CMPBEN_SIZE				1
+#define GEM_CMPC_OFFSET				25 /* Compare C - Index to screener type 2 Compare register */
+#define GEM_CMPC_SIZE				5
+#define GEM_CMPCEN_OFFSET			30 /* Compare C Enable */
+#define GEM_CMPCEN_SIZE				1
+
+/* Bitfields in ETHT */
+#define GEM_ETHTCMP_OFFSET			0 /* EtherType compare value */
+#define GEM_ETHTCMP_SIZE			16
+
+/* Bitfields in T2CMPW0 */
+#define GEM_T2CMP_OFFSET			16 /* 0xFFFF0000 compare value */
+#define GEM_T2CMP_SIZE				16
+#define GEM_T2MASK_OFFSET			0 /* 0x0000FFFF compare value or mask */
+#define GEM_T2MASK_SIZE				16
+
+/* Bitfields in T2CMPW1 */
+#define GEM_T2DISMSK_OFFSET			9 /* disable mask */
+#define GEM_T2DISMSK_SIZE			1
+#define GEM_T2CMPOFST_OFFSET			7 /* compare offset */
+#define GEM_T2CMPOFST_SIZE			2
+#define GEM_T2OFST_OFFSET			0 /* offset value */
+#define GEM_T2OFST_SIZE				7
+
+/* Offset for screener type 2 compare values (T2CMPOFST).
+ * Note the offset is applied after the specified point,
+ * e.g. GEM_T2COMPOFST_ETYPE denotes the EtherType field, so an offset
+ * of 12 bytes from this would be the source IP address in an IP header
+ */
+#define GEM_T2COMPOFST_SOF		0
+#define GEM_T2COMPOFST_ETYPE	1
+#define GEM_T2COMPOFST_IPHDR	2
+#define GEM_T2COMPOFST_TCPUDP	3
+
+/* offset from EtherType to IP address */
+#define ETYPE_SRCIP_OFFSET			12
+#define ETYPE_DSTIP_OFFSET			16
+
+/* offset from IP header to port */
+#define IPHDR_SRCPORT_OFFSET		0
+#define IPHDR_DSTPORT_OFFSET		2
+
 /* Transmit DMA buffer descriptor Word 1 */
 #define GEM_DMA_TXVALID_OFFSET		23 /* timestamp has been captured in the Buffer Descriptor */
 #define GEM_DMA_TXVALID_SIZE		1
@@ -583,6 +677,8 @@
 #define gem_writel(port, reg, value)	(port)->macb_reg_writel((port), GEM_##reg, (value))
 #define queue_readl(queue, reg)		(queue)->bp->macb_reg_readl((queue)->bp, (queue)->reg)
 #define queue_writel(queue, reg, value)	(queue)->bp->macb_reg_writel((queue)->bp, (queue)->reg, (value))
+#define gem_readl_n(port, reg, idx)		(port)->macb_reg_readl((port), GEM_##reg + idx * 4)
+#define gem_writel_n(port, reg, idx, value)	(port)->macb_reg_writel((port), GEM_##reg + idx * 4, (value))
 
 #define PTP_TS_BUFFER_SIZE		128 /* must be power of 2 */
 
@@ -920,13 +1016,42 @@ static const struct gem_statistic gem_statistics[] = {
 
 #define GEM_STATS_LEN ARRAY_SIZE(gem_statistics)
 
+#define QUEUE_STAT_TITLE(title) {	\
+	.stat_string = title,			\
+}
+
+/* per queue statistics, each should be unsigned long type */
+struct queue_stats {
+	union {
+		unsigned long first;
+		unsigned long rx_packets;
+	};
+	unsigned long rx_bytes;
+	unsigned long rx_dropped;
+	unsigned long tx_packets;
+	unsigned long tx_bytes;
+	unsigned long tx_dropped;
+};
+
+static const struct gem_statistic queue_statistics[] = {
+		QUEUE_STAT_TITLE("rx_packets"),
+		QUEUE_STAT_TITLE("rx_bytes"),
+		QUEUE_STAT_TITLE("rx_dropped"),
+		QUEUE_STAT_TITLE("tx_packets"),
+		QUEUE_STAT_TITLE("tx_bytes"),
+		QUEUE_STAT_TITLE("tx_dropped"),
+};
+
+#define QUEUE_STATS_LEN ARRAY_SIZE(queue_statistics)
+
 struct macb;
+struct macb_queue;
 
 struct macb_or_gem_ops {
 	int	(*mog_alloc_rx_buffers)(struct macb *bp);
 	void	(*mog_free_rx_buffers)(struct macb *bp);
 	void	(*mog_init_rings)(struct macb *bp);
-	int	(*mog_rx)(struct macb *bp, int budget);
+	int	(*mog_rx)(struct macb_queue *queue, int budget);
 };
 
 /* MACB-PTP interface: adapt to platform needs. */
@@ -968,6 +1093,9 @@ struct macb_queue {
 	unsigned int		IMR;
 	unsigned int		TBQP;
 	unsigned int		TBQPH;
+	unsigned int		RBQS;
+	unsigned int		RBQP;
+	unsigned int		RBQPH;
 
 	unsigned int		tx_head, tx_tail;
 	struct macb_dma_desc	*tx_ring;
@@ -975,6 +1103,16 @@ struct macb_queue {
 	dma_addr_t		tx_ring_dma;
 	struct work_struct	tx_error_task;
 
+	dma_addr_t		rx_ring_dma;
+	dma_addr_t		rx_buffers_dma;
+	unsigned int		rx_tail;
+	unsigned int		rx_prepared_head;
+	struct macb_dma_desc	*rx_ring;
+	struct sk_buff		**rx_skbuff;
+	void			*rx_buffers;
+	struct napi_struct	napi;
+	struct queue_stats stats;
+
 #ifdef CONFIG_MACB_USE_HWSTAMP
 	struct work_struct	tx_ts_task;
 	unsigned int		tx_ts_head, tx_ts_tail;
@@ -982,6 +1120,16 @@ struct macb_queue {
 #endif
 };
 
+struct ethtool_rx_fs_item {
+	struct ethtool_rx_flow_spec fs;
+	struct list_head list;
+};
+
+struct ethtool_rx_fs_list {
+	struct list_head list;
+	unsigned int count;
+};
+
 struct macb {
 	void __iomem		*regs;
 	bool			native_io;
@@ -990,11 +1138,6 @@ struct macb {
 	u32	(*macb_reg_readl)(struct macb *bp, int offset);
 	void	(*macb_reg_writel)(struct macb *bp, int offset, u32 value);
 
-	unsigned int		rx_tail;
-	unsigned int		rx_prepared_head;
-	struct macb_dma_desc	*rx_ring;
-	struct sk_buff		**rx_skbuff;
-	void			*rx_buffers;
 	size_t			rx_buffer_size;
 
 	unsigned int		rx_ring_size;
@@ -1011,15 +1154,11 @@ struct macb {
 	struct clk		*tx_clk;
 	struct clk		*rx_clk;
 	struct net_device	*dev;
-	struct napi_struct	napi;
 	union {
 		struct macb_stats	macb;
 		struct gem_stats	gem;
 	}			hw_stats;
 
-	dma_addr_t		rx_ring_dma;
-	dma_addr_t		rx_buffers_dma;
-
 	struct macb_or_gem_ops	macbgem_ops;
 
 	struct mii_bus		*mii_bus;
@@ -1032,7 +1171,6 @@ struct macb {
 	unsigned int		dma_burst_length;
 
 	phy_interface_t		phy_interface;
-	struct gpio_desc	*reset_gpio;
 
 	/* AT91RM9200 transmit */
 	struct sk_buff *skb;			/* holds skb until xmit interrupt completes */
@@ -1040,7 +1178,7 @@ struct macb {
 	int skb_length;				/* saved skb length for pci_unmap_single */
 	unsigned int		max_tx_length;
 
-	u64			ethtool_stats[GEM_STATS_LEN];
+	u64			ethtool_stats[GEM_STATS_LEN + QUEUE_STATS_LEN * MACB_MAX_QUEUES];
 
 	unsigned int		rx_frm_len_mask;
 	unsigned int		jumbo_max_len;
@@ -1057,6 +1195,11 @@ struct macb {
 	struct ptp_clock_info ptp_clock_info;
 	struct tsu_incr tsu_incr;
 	struct hwtstamp_config tstamp_config;
+
+	/* RX queue filer rule set*/
+	struct ethtool_rx_fs_list rx_fs_list;
+	spinlock_t rx_fs_lock;
+	unsigned int max_tuples;
 };
 
 #ifdef CONFIG_MACB_USE_HWSTAMP
diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
index 72a67f74b97b..234667eaaa92 100644
--- a/drivers/net/ethernet/cadence/macb_main.c
+++ b/drivers/net/ethernet/cadence/macb_main.c
@@ -194,17 +194,17 @@ static unsigned int macb_rx_ring_wrap(struct macb *bp, unsigned int index)
 	return index & (bp->rx_ring_size - 1);
 }
 
-static struct macb_dma_desc *macb_rx_desc(struct macb *bp, unsigned int index)
+static struct macb_dma_desc *macb_rx_desc(struct macb_queue *queue, unsigned int index)
 {
-	index = macb_rx_ring_wrap(bp, index);
-	index = macb_adj_dma_desc_idx(bp, index);
-	return &bp->rx_ring[index];
+	index = macb_rx_ring_wrap(queue->bp, index);
+	index = macb_adj_dma_desc_idx(queue->bp, index);
+	return &queue->rx_ring[index];
 }
 
-static void *macb_rx_buffer(struct macb *bp, unsigned int index)
+static void *macb_rx_buffer(struct macb_queue *queue, unsigned int index)
 {
-	return bp->rx_buffers + bp->rx_buffer_size *
-	       macb_rx_ring_wrap(bp, index);
+	return queue->rx_buffers + queue->bp->rx_buffer_size *
+	       macb_rx_ring_wrap(queue->bp, index);
 }
 
 /* I/O accessors */
@@ -759,7 +759,9 @@ static void macb_tx_error_task(struct work_struct *work)
 					    macb_tx_ring_wrap(bp, tail),
 					    skb->data);
 				bp->dev->stats.tx_packets++;
+				queue->stats.tx_packets++;
 				bp->dev->stats.tx_bytes += skb->len;
+				queue->stats.tx_bytes += skb->len;
 			}
 		} else {
 			/* "Buffers exhausted mid-frame" errors may only happen
@@ -859,7 +861,9 @@ static void macb_tx_interrupt(struct macb_queue *queue)
 					    macb_tx_ring_wrap(bp, tail),
 					    skb->data);
 				bp->dev->stats.tx_packets++;
+				queue->stats.tx_packets++;
 				bp->dev->stats.tx_bytes += skb->len;
+				queue->stats.tx_bytes += skb->len;
 			}
 
 			/* Now we can safely release resources */
@@ -881,24 +885,25 @@ static void macb_tx_interrupt(struct macb_queue *queue)
 		netif_wake_subqueue(bp->dev, queue_index);
 }
 
-static void gem_rx_refill(struct macb *bp)
+static void gem_rx_refill(struct macb_queue *queue)
 {
 	unsigned int		entry;
 	struct sk_buff		*skb;
 	dma_addr_t		paddr;
+	struct macb *bp = queue->bp;
 	struct macb_dma_desc *desc;
 
-	while (CIRC_SPACE(bp->rx_prepared_head, bp->rx_tail,
-			  bp->rx_ring_size) > 0) {
-		entry = macb_rx_ring_wrap(bp, bp->rx_prepared_head);
+	while (CIRC_SPACE(queue->rx_prepared_head, queue->rx_tail,
+			bp->rx_ring_size) > 0) {
+		entry = macb_rx_ring_wrap(bp, queue->rx_prepared_head);
 
 		/* Make hw descriptor updates visible to CPU */
 		rmb();
 
-		bp->rx_prepared_head++;
-		desc = macb_rx_desc(bp, entry);
+		queue->rx_prepared_head++;
+		desc = macb_rx_desc(queue, entry);
 
-		if (!bp->rx_skbuff[entry]) {
+		if (!queue->rx_skbuff[entry]) {
 			/* allocate sk_buff for this free entry in ring */
 			skb = netdev_alloc_skb(bp->dev, bp->rx_buffer_size);
 			if (unlikely(!skb)) {
@@ -916,7 +921,7 @@ static void gem_rx_refill(struct macb *bp)
 				break;
 			}
 
-			bp->rx_skbuff[entry] = skb;
+			queue->rx_skbuff[entry] = skb;
 
 			if (entry == bp->rx_ring_size - 1)
 				paddr |= MACB_BIT(RX_WRAP);
@@ -934,18 +939,18 @@ static void gem_rx_refill(struct macb *bp)
 	/* Make descriptor updates visible to hardware */
 	wmb();
 
-	netdev_vdbg(bp->dev, "rx ring: prepared head %d, tail %d\n",
-		    bp->rx_prepared_head, bp->rx_tail);
+	netdev_vdbg(bp->dev, "rx ring: queue: %p, prepared head %d, tail %d\n",
+			queue, queue->rx_prepared_head, queue->rx_tail);
 }
 
 /* Mark DMA descriptors from begin up to and not including end as unused */
-static void discard_partial_frame(struct macb *bp, unsigned int begin,
+static void discard_partial_frame(struct macb_queue *queue, unsigned int begin,
 				  unsigned int end)
 {
 	unsigned int frag;
 
 	for (frag = begin; frag != end; frag++) {
-		struct macb_dma_desc *desc = macb_rx_desc(bp, frag);
+		struct macb_dma_desc *desc = macb_rx_desc(queue, frag);
 
 		desc->addr &= ~MACB_BIT(RX_USED);
 	}
@@ -959,8 +964,9 @@ static void discard_partial_frame(struct macb *bp, unsigned int begin,
 	 */
 }
 
-static int gem_rx(struct macb *bp, int budget)
+static int gem_rx(struct macb_queue *queue, int budget)
 {
+	struct macb *bp = queue->bp;
 	unsigned int		len;
 	unsigned int		entry;
 	struct sk_buff		*skb;
@@ -972,8 +978,8 @@ static int gem_rx(struct macb *bp, int budget)
 		dma_addr_t addr;
 		bool rxused;
 
-		entry = macb_rx_ring_wrap(bp, bp->rx_tail);
-		desc = macb_rx_desc(bp, entry);
+		entry = macb_rx_ring_wrap(bp, queue->rx_tail);
+		desc = macb_rx_desc(queue, entry);
 
 		/* Make hw descriptor updates visible to CPU */
 		rmb();
@@ -985,24 +991,26 @@ static int gem_rx(struct macb *bp, int budget)
 		if (!rxused)
 			break;
 
-		bp->rx_tail++;
+		queue->rx_tail++;
 		count++;
 
 		if (!(ctrl & MACB_BIT(RX_SOF) && ctrl & MACB_BIT(RX_EOF))) {
 			netdev_err(bp->dev,
 				   "not whole frame pointed by descriptor\n");
 			bp->dev->stats.rx_dropped++;
+			queue->stats.rx_dropped++;
 			break;
 		}
-		skb = bp->rx_skbuff[entry];
+		skb = queue->rx_skbuff[entry];
 		if (unlikely(!skb)) {
 			netdev_err(bp->dev,
 				   "inconsistent Rx descriptor chain\n");
 			bp->dev->stats.rx_dropped++;
+			queue->stats.rx_dropped++;
 			break;
 		}
 		/* now everything is ready for receiving packet */
-		bp->rx_skbuff[entry] = NULL;
+		queue->rx_skbuff[entry] = NULL;
 		len = ctrl & bp->rx_frm_len_mask;
 
 		netdev_vdbg(bp->dev, "gem_rx %u (len %u)\n", entry, len);
@@ -1019,7 +1027,9 @@ static int gem_rx(struct macb *bp, int budget)
 			skb->ip_summed = CHECKSUM_UNNECESSARY;
 
 		bp->dev->stats.rx_packets++;
+		queue->stats.rx_packets++;
 		bp->dev->stats.rx_bytes += skb->len;
+		queue->stats.rx_bytes += skb->len;
 
 		gem_ptp_do_rxstamp(bp, skb, desc);
 
@@ -1035,12 +1045,12 @@ static int gem_rx(struct macb *bp, int budget)
 		netif_receive_skb(skb);
 	}
 
-	gem_rx_refill(bp);
+	gem_rx_refill(queue);
 
 	return count;
 }
 
-static int macb_rx_frame(struct macb *bp, unsigned int first_frag,
+static int macb_rx_frame(struct macb_queue *queue, unsigned int first_frag,
 			 unsigned int last_frag)
 {
 	unsigned int len;
@@ -1048,8 +1058,9 @@ static int macb_rx_frame(struct macb *bp, unsigned int first_frag,
 	unsigned int offset;
 	struct sk_buff *skb;
 	struct macb_dma_desc *desc;
+	struct macb *bp = queue->bp;
 
-	desc = macb_rx_desc(bp, last_frag);
+	desc = macb_rx_desc(queue, last_frag);
 	len = desc->ctrl & bp->rx_frm_len_mask;
 
 	netdev_vdbg(bp->dev, "macb_rx_frame frags %u - %u (len %u)\n",
@@ -1068,7 +1079,7 @@ static int macb_rx_frame(struct macb *bp, unsigned int first_frag,
 	if (!skb) {
 		bp->dev->stats.rx_dropped++;
 		for (frag = first_frag; ; frag++) {
-			desc = macb_rx_desc(bp, frag);
+			desc = macb_rx_desc(queue, frag);
 			desc->addr &= ~MACB_BIT(RX_USED);
 			if (frag == last_frag)
 				break;
@@ -1096,10 +1107,10 @@ static int macb_rx_frame(struct macb *bp, unsigned int first_frag,
 			frag_len = len - offset;
 		}
 		skb_copy_to_linear_data_offset(skb, offset,
-					       macb_rx_buffer(bp, frag),
+					       macb_rx_buffer(queue, frag),
 					       frag_len);
 		offset += bp->rx_buffer_size;
-		desc = macb_rx_desc(bp, frag);
+		desc = macb_rx_desc(queue, frag);
 		desc->addr &= ~MACB_BIT(RX_USED);
 
 		if (frag == last_frag)
@@ -1121,32 +1132,34 @@ static int macb_rx_frame(struct macb *bp, unsigned int first_frag,
 	return 0;
 }
 
-static inline void macb_init_rx_ring(struct macb *bp)
+static inline void macb_init_rx_ring(struct macb_queue *queue)
 {
+	struct macb *bp = queue->bp;
 	dma_addr_t addr;
 	struct macb_dma_desc *desc = NULL;
 	int i;
 
-	addr = bp->rx_buffers_dma;
+	addr = queue->rx_buffers_dma;
 	for (i = 0; i < bp->rx_ring_size; i++) {
-		desc = macb_rx_desc(bp, i);
+		desc = macb_rx_desc(queue, i);
 		macb_set_addr(bp, desc, addr);
 		desc->ctrl = 0;
 		addr += bp->rx_buffer_size;
 	}
 	desc->addr |= MACB_BIT(RX_WRAP);
-	bp->rx_tail = 0;
+	queue->rx_tail = 0;
 }
 
-static int macb_rx(struct macb *bp, int budget)
+static int macb_rx(struct macb_queue *queue, int budget)
 {
+	struct macb *bp = queue->bp;
 	bool reset_rx_queue = false;
 	int received = 0;
 	unsigned int tail;
 	int first_frag = -1;
 
-	for (tail = bp->rx_tail; budget > 0; tail++) {
-		struct macb_dma_desc *desc = macb_rx_desc(bp, tail);
+	for (tail = queue->rx_tail; budget > 0; tail++) {
+		struct macb_dma_desc *desc = macb_rx_desc(queue, tail);
 		u32 ctrl;
 
 		/* Make hw descriptor updates visible to CPU */
@@ -1159,7 +1172,7 @@ static int macb_rx(struct macb *bp, int budget)
 
 		if (ctrl & MACB_BIT(RX_SOF)) {
 			if (first_frag != -1)
-				discard_partial_frame(bp, first_frag, tail);
+				discard_partial_frame(queue, first_frag, tail);
 			first_frag = tail;
 		}
 
@@ -1171,7 +1184,7 @@ static int macb_rx(struct macb *bp, int budget)
 				continue;
 			}
 
-			dropped = macb_rx_frame(bp, first_frag, tail);
+			dropped = macb_rx_frame(queue, first_frag, tail);
 			first_frag = -1;
 			if (unlikely(dropped < 0)) {
 				reset_rx_queue = true;
@@ -1195,8 +1208,8 @@ static int macb_rx(struct macb *bp, int budget)
 		ctrl = macb_readl(bp, NCR);
 		macb_writel(bp, NCR, ctrl & ~MACB_BIT(RE));
 
-		macb_init_rx_ring(bp);
-		macb_writel(bp, RBQP, bp->rx_ring_dma);
+		macb_init_rx_ring(queue);
+		queue_writel(queue, RBQP, queue->rx_ring_dma);
 
 		macb_writel(bp, NCR, ctrl | MACB_BIT(RE));
 
@@ -1205,16 +1218,17 @@ static int macb_rx(struct macb *bp, int budget)
 	}
 
 	if (first_frag != -1)
-		bp->rx_tail = first_frag;
+		queue->rx_tail = first_frag;
 	else
-		bp->rx_tail = tail;
+		queue->rx_tail = tail;
 
 	return received;
 }
 
 static int macb_poll(struct napi_struct *napi, int budget)
 {
-	struct macb *bp = container_of(napi, struct macb, napi);
+	struct macb_queue *queue = container_of(napi, struct macb_queue, napi);
+	struct macb *bp = queue->bp;
 	int work_done;
 	u32 status;
 
@@ -1224,7 +1238,7 @@ static int macb_poll(struct napi_struct *napi, int budget)
 	netdev_vdbg(bp->dev, "poll: status = %08lx, budget = %d\n",
 		    (unsigned long)status, budget);
 
-	work_done = bp->macbgem_ops.mog_rx(bp, budget);
+	work_done = bp->macbgem_ops.mog_rx(queue, budget);
 	if (work_done < budget) {
 		napi_complete_done(napi, work_done);
 
@@ -1232,10 +1246,10 @@ static int macb_poll(struct napi_struct *napi, int budget)
 		status = macb_readl(bp, RSR);
 		if (status) {
 			if (bp->caps & MACB_CAPS_ISR_CLEAR_ON_WRITE)
-				macb_writel(bp, ISR, MACB_BIT(RCOMP));
+				queue_writel(queue, ISR, MACB_BIT(RCOMP));
 			napi_reschedule(napi);
 		} else {
-			macb_writel(bp, IER, MACB_RX_INT_FLAGS);
+			queue_writel(queue, IER, MACB_RX_INT_FLAGS);
 		}
 	}
 
@@ -1282,9 +1296,9 @@ static irqreturn_t macb_interrupt(int irq, void *dev_id)
 			if (bp->caps & MACB_CAPS_ISR_CLEAR_ON_WRITE)
 				queue_writel(queue, ISR, MACB_BIT(RCOMP));
 
-			if (napi_schedule_prep(&bp->napi)) {
+			if (napi_schedule_prep(&queue->napi)) {
 				netdev_vdbg(bp->dev, "scheduling RX softirq\n");
-				__napi_schedule(&bp->napi);
+				__napi_schedule(&queue->napi);
 			}
 		}
 
@@ -1708,38 +1722,44 @@ static void gem_free_rx_buffers(struct macb *bp)
 {
 	struct sk_buff		*skb;
 	struct macb_dma_desc	*desc;
+	struct macb_queue *queue;
 	dma_addr_t		addr;
+	unsigned int q;
 	int i;
 
-	if (!bp->rx_skbuff)
-		return;
+	for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue) {
+		if (!queue->rx_skbuff)
+			continue;
 
-	for (i = 0; i < bp->rx_ring_size; i++) {
-		skb = bp->rx_skbuff[i];
+		for (i = 0; i < bp->rx_ring_size; i++) {
+			skb = queue->rx_skbuff[i];
 
-		if (!skb)
-			continue;
+			if (!skb)
+				continue;
 
-		desc = macb_rx_desc(bp, i);
-		addr = macb_get_addr(bp, desc);
+			desc = macb_rx_desc(queue, i);
+			addr = macb_get_addr(bp, desc);
 
-		dma_unmap_single(&bp->pdev->dev, addr, bp->rx_buffer_size,
-				 DMA_FROM_DEVICE);
-		dev_kfree_skb_any(skb);
-		skb = NULL;
-	}
+			dma_unmap_single(&bp->pdev->dev, addr, bp->rx_buffer_size,
+					DMA_FROM_DEVICE);
+			dev_kfree_skb_any(skb);
+			skb = NULL;
+		}
 
-	kfree(bp->rx_skbuff);
-	bp->rx_skbuff = NULL;
+		kfree(queue->rx_skbuff);
+		queue->rx_skbuff = NULL;
+	}
 }
 
 static void macb_free_rx_buffers(struct macb *bp)
 {
-	if (bp->rx_buffers) {
+	struct macb_queue *queue = &bp->queues[0];
+
+	if (queue->rx_buffers) {
 		dma_free_coherent(&bp->pdev->dev,
 				  bp->rx_ring_size * bp->rx_buffer_size,
-				  bp->rx_buffers, bp->rx_buffers_dma);
-		bp->rx_buffers = NULL;
+				  queue->rx_buffers, queue->rx_buffers_dma);
+		queue->rx_buffers = NULL;
 	}
 }
 
@@ -1748,11 +1768,12 @@ static void macb_free_consistent(struct macb *bp)
 	struct macb_queue *queue;
 	unsigned int q;
 
+	queue = &bp->queues[0];
 	bp->macbgem_ops.mog_free_rx_buffers(bp);
-	if (bp->rx_ring) {
+	if (queue->rx_ring) {
 		dma_free_coherent(&bp->pdev->dev, RX_RING_BYTES(bp),
-				  bp->rx_ring, bp->rx_ring_dma);
-		bp->rx_ring = NULL;
+				queue->rx_ring, queue->rx_ring_dma);
+		queue->rx_ring = NULL;
 	}
 
 	for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue) {
@@ -1768,32 +1789,37 @@ static void macb_free_consistent(struct macb *bp)
 
 static int gem_alloc_rx_buffers(struct macb *bp)
 {
+	struct macb_queue *queue;
+	unsigned int q;
 	int size;
 
-	size = bp->rx_ring_size * sizeof(struct sk_buff *);
-	bp->rx_skbuff = kzalloc(size, GFP_KERNEL);
-	if (!bp->rx_skbuff)
-		return -ENOMEM;
-	else
-		netdev_dbg(bp->dev,
-			   "Allocated %d RX struct sk_buff entries at %p\n",
-			   bp->rx_ring_size, bp->rx_skbuff);
+	for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue) {
+		size = bp->rx_ring_size * sizeof(struct sk_buff *);
+		queue->rx_skbuff = kzalloc(size, GFP_KERNEL);
+		if (!queue->rx_skbuff)
+			return -ENOMEM;
+		else
+			netdev_dbg(bp->dev,
+				   "Allocated %d RX struct sk_buff entries at %p\n",
+				   bp->rx_ring_size, queue->rx_skbuff);
+	}
 	return 0;
 }
 
 static int macb_alloc_rx_buffers(struct macb *bp)
 {
+	struct macb_queue *queue = &bp->queues[0];
 	int size;
 
 	size = bp->rx_ring_size * bp->rx_buffer_size;
-	bp->rx_buffers = dma_alloc_coherent(&bp->pdev->dev, size,
-					    &bp->rx_buffers_dma, GFP_KERNEL);
-	if (!bp->rx_buffers)
+	queue->rx_buffers = dma_alloc_coherent(&bp->pdev->dev, size,
+					    &queue->rx_buffers_dma, GFP_KERNEL);
+	if (!queue->rx_buffers)
 		return -ENOMEM;
 
 	netdev_dbg(bp->dev,
 		   "Allocated RX buffers of %d bytes at %08lx (mapped %p)\n",
-		   size, (unsigned long)bp->rx_buffers_dma, bp->rx_buffers);
+		   size, (unsigned long)queue->rx_buffers_dma, queue->rx_buffers);
 	return 0;
 }
 
@@ -1819,17 +1845,16 @@ static int macb_alloc_consistent(struct macb *bp)
 		queue->tx_skb = kmalloc(size, GFP_KERNEL);
 		if (!queue->tx_skb)
 			goto out_err;
-	}
-
-	size = RX_RING_BYTES(bp);
-	bp->rx_ring = dma_alloc_coherent(&bp->pdev->dev, size,
-					 &bp->rx_ring_dma, GFP_KERNEL);
-	if (!bp->rx_ring)
-		goto out_err;
-	netdev_dbg(bp->dev,
-		   "Allocated RX ring of %d bytes at %08lx (mapped %p)\n",
-		   size, (unsigned long)bp->rx_ring_dma, bp->rx_ring);
 
+		size = RX_RING_BYTES(bp);
+		queue->rx_ring = dma_alloc_coherent(&bp->pdev->dev, size,
+						 &queue->rx_ring_dma, GFP_KERNEL);
+		if (!queue->rx_ring)
+			goto out_err;
+		netdev_dbg(bp->dev,
+			   "Allocated RX ring of %d bytes at %08lx (mapped %p)\n",
+			   size, (unsigned long)queue->rx_ring_dma, queue->rx_ring);
+	}
 	if (bp->macbgem_ops.mog_alloc_rx_buffers(bp))
 		goto out_err;
 
@@ -1856,12 +1881,13 @@ static void gem_init_rings(struct macb *bp)
 		desc->ctrl |= MACB_BIT(TX_WRAP);
 		queue->tx_head = 0;
 		queue->tx_tail = 0;
-	}
 
-	bp->rx_tail = 0;
-	bp->rx_prepared_head = 0;
+		queue->rx_tail = 0;
+		queue->rx_prepared_head = 0;
+
+		gem_rx_refill(queue);
+	}
 
-	gem_rx_refill(bp);
 }
 
 static void macb_init_rings(struct macb *bp)
@@ -1869,7 +1895,7 @@ static void macb_init_rings(struct macb *bp)
 	int i;
 	struct macb_dma_desc *desc = NULL;
 
-	macb_init_rx_ring(bp);
+	macb_init_rx_ring(&bp->queues[0]);
 
 	for (i = 0; i < bp->tx_ring_size; i++) {
 		desc = macb_tx_desc(&bp->queues[0], i);
@@ -1978,11 +2004,20 @@ static u32 macb_dbw(struct macb *bp)
  */
 static void macb_configure_dma(struct macb *bp)
 {
+	struct macb_queue *queue;
+	u32 buffer_size;
+	unsigned int q;
 	u32 dmacfg;
 
+	buffer_size = bp->rx_buffer_size / RX_BUFFER_MULTIPLE;
 	if (macb_is_gem(bp)) {
 		dmacfg = gem_readl(bp, DMACFG) & ~GEM_BF(RXBS, -1L);
-		dmacfg |= GEM_BF(RXBS, bp->rx_buffer_size / RX_BUFFER_MULTIPLE);
+		for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue) {
+			if (q)
+				queue_writel(queue, RBQS, buffer_size);
+			else
+				dmacfg |= GEM_BF(RXBS, buffer_size);
+		}
 		if (bp->dma_burst_length)
 			dmacfg = GEM_BFINS(FBLDO, bp->dma_burst_length, dmacfg);
 		dmacfg |= GEM_BIT(TXPBMS) | GEM_BF(RXBMS, -1L);
@@ -2051,12 +2086,12 @@ static void macb_init_hw(struct macb *bp)
 	macb_configure_dma(bp);
 
 	/* Initialize TX and RX buffers */
-	macb_writel(bp, RBQP, lower_32_bits(bp->rx_ring_dma));
+	for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue) {
+		queue_writel(queue, RBQP, lower_32_bits(queue->rx_ring_dma));
 #ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
-	if (bp->hw_dma_cap & HW_DMA_CAP_64B)
-		macb_writel(bp, RBQPH, upper_32_bits(bp->rx_ring_dma));
+		if (bp->hw_dma_cap & HW_DMA_CAP_64B)
+			queue_writel(queue, RBQPH, upper_32_bits(queue->rx_ring_dma));
 #endif
-	for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue) {
 		queue_writel(queue, TBQP, lower_32_bits(queue->tx_ring_dma));
 #ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
 		if (bp->hw_dma_cap & HW_DMA_CAP_64B)
@@ -2197,6 +2232,8 @@ static int macb_open(struct net_device *dev)
 {
 	struct macb *bp = netdev_priv(dev);
 	size_t bufsz = dev->mtu + ETH_HLEN + ETH_FCS_LEN + NET_IP_ALIGN;
+	struct macb_queue *queue;
+	unsigned int q;
 	int err;
 
 	netdev_dbg(bp->dev, "open\n");
@@ -2218,11 +2255,12 @@ static int macb_open(struct net_device *dev)
 		return err;
 	}
 
-	napi_enable(&bp->napi);
-
 	bp->macbgem_ops.mog_init_rings(bp);
 	macb_init_hw(bp);
 
+	for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue)
+		napi_enable(&queue->napi);
+
 	/* schedule a link state check */
 	phy_start(dev->phydev);
 
@@ -2237,10 +2275,14 @@ static int macb_open(struct net_device *dev)
 static int macb_close(struct net_device *dev)
 {
 	struct macb *bp = netdev_priv(dev);
+	struct macb_queue *queue;
 	unsigned long flags;
+	unsigned int q;
 
 	netif_tx_stop_all_queues(dev);
-	napi_disable(&bp->napi);
+
+	for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue)
+		napi_disable(&queue->napi);
 
 	if (dev->phydev)
 		phy_stop(dev->phydev);
@@ -2270,7 +2312,10 @@ static int macb_change_mtu(struct net_device *dev, int new_mtu)
 
 static void gem_update_stats(struct macb *bp)
 {
-	unsigned int i;
+	struct macb_queue *queue;
+	unsigned int i, q, idx;
+	unsigned long *stat;
+
 	u32 *p = &bp->hw_stats.gem.tx_octets_31_0;
 
 	for (i = 0; i < GEM_STATS_LEN; ++i, ++p) {
@@ -2287,6 +2332,11 @@ static void gem_update_stats(struct macb *bp)
 			*(++p) += val;
 		}
 	}
+
+	idx = GEM_STATS_LEN;
+	for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue)
+		for (i = 0, stat = &queue->stats.first; i < QUEUE_STATS_LEN; ++i, ++stat)
+			bp->ethtool_stats[idx++] = *stat;
 }
 
 static struct net_device_stats *gem_get_stats(struct macb *bp)
@@ -2334,14 +2384,17 @@ static void gem_get_ethtool_stats(struct net_device *dev,
 
 	bp = netdev_priv(dev);
 	gem_update_stats(bp);
-	memcpy(data, &bp->ethtool_stats, sizeof(u64) * GEM_STATS_LEN);
+	memcpy(data, &bp->ethtool_stats, sizeof(u64)
+			* (GEM_STATS_LEN + QUEUE_STATS_LEN * MACB_MAX_QUEUES));
 }
 
 static int gem_get_sset_count(struct net_device *dev, int sset)
 {
+	struct macb *bp = netdev_priv(dev);
+
 	switch (sset) {
 	case ETH_SS_STATS:
-		return GEM_STATS_LEN;
+		return GEM_STATS_LEN + bp->num_queues * QUEUE_STATS_LEN;
 	default:
 		return -EOPNOTSUPP;
 	}
@@ -2349,13 +2402,25 @@ static int gem_get_sset_count(struct net_device *dev, int sset)
 
 static void gem_get_ethtool_strings(struct net_device *dev, u32 sset, u8 *p)
 {
+	char stat_string[ETH_GSTRING_LEN];
+	struct macb *bp = netdev_priv(dev);
+	struct macb_queue *queue;
 	unsigned int i;
+	unsigned int q;
 
 	switch (sset) {
 	case ETH_SS_STATS:
 		for (i = 0; i < GEM_STATS_LEN; i++, p += ETH_GSTRING_LEN)
 			memcpy(p, gem_statistics[i].stat_string,
 			       ETH_GSTRING_LEN);
+
+		for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue) {
+			for (i = 0; i < QUEUE_STATS_LEN; i++, p += ETH_GSTRING_LEN) {
+				snprintf(stat_string, ETH_GSTRING_LEN, "q%d_%s",
+						q, queue_statistics[i].stat_string);
+				memcpy(p, stat_string, ETH_GSTRING_LEN);
+			}
+		}
 		break;
 	}
 }
@@ -2603,6 +2668,307 @@ static int macb_get_ts_info(struct net_device *netdev,
 	return ethtool_op_get_ts_info(netdev, info);
 }
 
+static void gem_enable_flow_filters(struct macb *bp, bool enable)
+{
+	struct ethtool_rx_fs_item *item;
+	u32 t2_scr;
+	int num_t2_scr;
+
+	num_t2_scr = GEM_BFEXT(T2SCR, gem_readl(bp, DCFG8));
+
+	list_for_each_entry(item, &bp->rx_fs_list.list, list) {
+		struct ethtool_rx_flow_spec *fs = &item->fs;
+		struct ethtool_tcpip4_spec *tp4sp_m;
+
+		if (fs->location >= num_t2_scr)
+			continue;
+
+		t2_scr = gem_readl_n(bp, SCRT2, fs->location);
+
+		/* enable/disable screener regs for the flow entry */
+		t2_scr = GEM_BFINS(ETHTEN, enable, t2_scr);
+
+		/* only enable fields with no masking */
+		tp4sp_m = &(fs->m_u.tcp_ip4_spec);
+
+		if (enable && (tp4sp_m->ip4src == 0xFFFFFFFF))
+			t2_scr = GEM_BFINS(CMPAEN, 1, t2_scr);
+		else
+			t2_scr = GEM_BFINS(CMPAEN, 0, t2_scr);
+
+		if (enable && (tp4sp_m->ip4dst == 0xFFFFFFFF))
+			t2_scr = GEM_BFINS(CMPBEN, 1, t2_scr);
+		else
+			t2_scr = GEM_BFINS(CMPBEN, 0, t2_scr);
+
+		if (enable && ((tp4sp_m->psrc == 0xFFFF) || (tp4sp_m->pdst == 0xFFFF)))
+			t2_scr = GEM_BFINS(CMPCEN, 1, t2_scr);
+		else
+			t2_scr = GEM_BFINS(CMPCEN, 0, t2_scr);
+
+		gem_writel_n(bp, SCRT2, fs->location, t2_scr);
+	}
+}
+
+static void gem_prog_cmp_regs(struct macb *bp, struct ethtool_rx_flow_spec *fs)
+{
+	struct ethtool_tcpip4_spec *tp4sp_v, *tp4sp_m;
+	uint16_t index = fs->location;
+	u32 w0, w1, t2_scr;
+	bool cmp_a = false;
+	bool cmp_b = false;
+	bool cmp_c = false;
+
+	tp4sp_v = &(fs->h_u.tcp_ip4_spec);
+	tp4sp_m = &(fs->m_u.tcp_ip4_spec);
+
+	/* ignore field if any masking set */
+	if (tp4sp_m->ip4src == 0xFFFFFFFF) {
+		/* 1st compare reg - IP source address */
+		w0 = 0;
+		w1 = 0;
+		w0 = tp4sp_v->ip4src;
+		w1 = GEM_BFINS(T2DISMSK, 1, w1); /* 32-bit compare */
+		w1 = GEM_BFINS(T2CMPOFST, GEM_T2COMPOFST_ETYPE, w1);
+		w1 = GEM_BFINS(T2OFST, ETYPE_SRCIP_OFFSET, w1);
+		gem_writel_n(bp, T2CMPW0, T2CMP_OFST(GEM_IP4SRC_CMP(index)), w0);
+		gem_writel_n(bp, T2CMPW1, T2CMP_OFST(GEM_IP4SRC_CMP(index)), w1);
+		cmp_a = true;
+	}
+
+	/* ignore field if any masking set */
+	if (tp4sp_m->ip4dst == 0xFFFFFFFF) {
+		/* 2nd compare reg - IP destination address */
+		w0 = 0;
+		w1 = 0;
+		w0 = tp4sp_v->ip4dst;
+		w1 = GEM_BFINS(T2DISMSK, 1, w1); /* 32-bit compare */
+		w1 = GEM_BFINS(T2CMPOFST, GEM_T2COMPOFST_ETYPE, w1);
+		w1 = GEM_BFINS(T2OFST, ETYPE_DSTIP_OFFSET, w1);
+		gem_writel_n(bp, T2CMPW0, T2CMP_OFST(GEM_IP4DST_CMP(index)), w0);
+		gem_writel_n(bp, T2CMPW1, T2CMP_OFST(GEM_IP4DST_CMP(index)), w1);
+		cmp_b = true;
+	}
+
+	/* ignore both port fields if masking set in both */
+	if ((tp4sp_m->psrc == 0xFFFF) || (tp4sp_m->pdst == 0xFFFF)) {
+		/* 3rd compare reg - source port, destination port */
+		w0 = 0;
+		w1 = 0;
+		w1 = GEM_BFINS(T2CMPOFST, GEM_T2COMPOFST_IPHDR, w1);
+		if (tp4sp_m->psrc == tp4sp_m->pdst) {
+			w0 = GEM_BFINS(T2MASK, tp4sp_v->psrc, w0);
+			w0 = GEM_BFINS(T2CMP, tp4sp_v->pdst, w0);
+			w1 = GEM_BFINS(T2DISMSK, 1, w1); /* 32-bit compare */
+			w1 = GEM_BFINS(T2OFST, IPHDR_SRCPORT_OFFSET, w1);
+		} else {
+			/* only one port definition */
+			w1 = GEM_BFINS(T2DISMSK, 0, w1); /* 16-bit compare */
+			w0 = GEM_BFINS(T2MASK, 0xFFFF, w0);
+			if (tp4sp_m->psrc == 0xFFFF) { /* src port */
+				w0 = GEM_BFINS(T2CMP, tp4sp_v->psrc, w0);
+				w1 = GEM_BFINS(T2OFST, IPHDR_SRCPORT_OFFSET, w1);
+			} else { /* dst port */
+				w0 = GEM_BFINS(T2CMP, tp4sp_v->pdst, w0);
+				w1 = GEM_BFINS(T2OFST, IPHDR_DSTPORT_OFFSET, w1);
+			}
+		}
+		gem_writel_n(bp, T2CMPW0, T2CMP_OFST(GEM_PORT_CMP(index)), w0);
+		gem_writel_n(bp, T2CMPW1, T2CMP_OFST(GEM_PORT_CMP(index)), w1);
+		cmp_c = true;
+	}
+
+	t2_scr = 0;
+	t2_scr = GEM_BFINS(QUEUE, (fs->ring_cookie) & 0xFF, t2_scr);
+	t2_scr = GEM_BFINS(ETHT2IDX, SCRT2_ETHT, t2_scr);
+	if (cmp_a)
+		t2_scr = GEM_BFINS(CMPA, GEM_IP4SRC_CMP(index), t2_scr);
+	if (cmp_b)
+		t2_scr = GEM_BFINS(CMPB, GEM_IP4DST_CMP(index), t2_scr);
+	if (cmp_c)
+		t2_scr = GEM_BFINS(CMPC, GEM_PORT_CMP(index), t2_scr);
+	gem_writel_n(bp, SCRT2, index, t2_scr);
+}
+
+static int gem_add_flow_filter(struct net_device *netdev,
+		struct ethtool_rxnfc *cmd)
+{
+	struct macb *bp = netdev_priv(netdev);
+	struct ethtool_rx_flow_spec *fs = &cmd->fs;
+	struct ethtool_rx_fs_item *item, *newfs;
+	unsigned long flags;
+	int ret = -EINVAL;
+	bool added = false;
+
+	newfs = kmalloc(sizeof(*newfs), GFP_KERNEL);
+	if (newfs == NULL)
+		return -ENOMEM;
+	memcpy(&newfs->fs, fs, sizeof(newfs->fs));
+
+	netdev_dbg(netdev,
+			"Adding flow filter entry,type=%u,queue=%u,loc=%u,src=%08X,dst=%08X,ps=%u,pd=%u\n",
+			fs->flow_type, (int)fs->ring_cookie, fs->location,
+			htonl(fs->h_u.tcp_ip4_spec.ip4src),
+			htonl(fs->h_u.tcp_ip4_spec.ip4dst),
+			htons(fs->h_u.tcp_ip4_spec.psrc), htons(fs->h_u.tcp_ip4_spec.pdst));
+
+	spin_lock_irqsave(&bp->rx_fs_lock, flags);
+
+	/* find correct place to add in list */
+	list_for_each_entry(item, &bp->rx_fs_list.list, list) {
+		if (item->fs.location > newfs->fs.location) {
+			list_add_tail(&newfs->list, &item->list);
+			added = true;
+			break;
+		} else if (item->fs.location == fs->location) {
+			netdev_err(netdev, "Rule not added: location %d not free!\n",
+					fs->location);
+			ret = -EBUSY;
+			goto err;
+		}
+	}
+	if (!added)
+		list_add_tail(&newfs->list, &bp->rx_fs_list.list);
+
+	gem_prog_cmp_regs(bp, fs);
+	bp->rx_fs_list.count++;
+	/* enable filtering if NTUPLE on */
+	if (netdev->features & NETIF_F_NTUPLE)
+		gem_enable_flow_filters(bp, 1);
+
+	spin_unlock_irqrestore(&bp->rx_fs_lock, flags);
+	return 0;
+
+err:
+	spin_unlock_irqrestore(&bp->rx_fs_lock, flags);
+	kfree(newfs);
+	return ret;
+}
+
+static int gem_del_flow_filter(struct net_device *netdev,
+		struct ethtool_rxnfc *cmd)
+{
+	struct macb *bp = netdev_priv(netdev);
+	struct ethtool_rx_fs_item *item;
+	struct ethtool_rx_flow_spec *fs;
+	unsigned long flags;
+
+	spin_lock_irqsave(&bp->rx_fs_lock, flags);
+
+	list_for_each_entry(item, &bp->rx_fs_list.list, list) {
+		if (item->fs.location == cmd->fs.location) {
+			/* disable screener regs for the flow entry */
+			fs = &(item->fs);
+			netdev_dbg(netdev,
+					"Deleting flow filter entry,type=%u,queue=%u,loc=%u,src=%08X,dst=%08X,ps=%u,pd=%u\n",
+					fs->flow_type, (int)fs->ring_cookie, fs->location,
+					htonl(fs->h_u.tcp_ip4_spec.ip4src),
+					htonl(fs->h_u.tcp_ip4_spec.ip4dst),
+					htons(fs->h_u.tcp_ip4_spec.psrc),
+					htons(fs->h_u.tcp_ip4_spec.pdst));
+
+			gem_writel_n(bp, SCRT2, fs->location, 0);
+
+			list_del(&item->list);
+			bp->rx_fs_list.count--;
+			spin_unlock_irqrestore(&bp->rx_fs_lock, flags);
+			kfree(item);
+			return 0;
+		}
+	}
+
+	spin_unlock_irqrestore(&bp->rx_fs_lock, flags);
+	return -EINVAL;
+}
+
+static int gem_get_flow_entry(struct net_device *netdev,
+		struct ethtool_rxnfc *cmd)
+{
+	struct macb *bp = netdev_priv(netdev);
+	struct ethtool_rx_fs_item *item;
+
+	list_for_each_entry(item, &bp->rx_fs_list.list, list) {
+		if (item->fs.location == cmd->fs.location) {
+			memcpy(&cmd->fs, &item->fs, sizeof(cmd->fs));
+			return 0;
+		}
+	}
+	return -EINVAL;
+}
+
+static int gem_get_all_flow_entries(struct net_device *netdev,
+		struct ethtool_rxnfc *cmd, u32 *rule_locs)
+{
+	struct macb *bp = netdev_priv(netdev);
+	struct ethtool_rx_fs_item *item;
+	uint32_t cnt = 0;
+
+	list_for_each_entry(item, &bp->rx_fs_list.list, list) {
+		if (cnt == cmd->rule_cnt)
+			return -EMSGSIZE;
+		rule_locs[cnt] = item->fs.location;
+		cnt++;
+	}
+	cmd->data = bp->max_tuples;
+	cmd->rule_cnt = cnt;
+
+	return 0;
+}
+
+static int gem_get_rxnfc(struct net_device *netdev, struct ethtool_rxnfc *cmd,
+		u32 *rule_locs)
+{
+	struct macb *bp = netdev_priv(netdev);
+	int ret = 0;
+
+	switch (cmd->cmd) {
+	case ETHTOOL_GRXRINGS:
+		cmd->data = bp->num_queues;
+		break;
+	case ETHTOOL_GRXCLSRLCNT:
+		cmd->rule_cnt = bp->rx_fs_list.count;
+		break;
+	case ETHTOOL_GRXCLSRULE:
+		ret = gem_get_flow_entry(netdev, cmd);
+		break;
+	case ETHTOOL_GRXCLSRLALL:
+		ret = gem_get_all_flow_entries(netdev, cmd, rule_locs);
+		break;
+	default:
+		netdev_err(netdev,
+			  "Command parameter %d is not supported\n", cmd->cmd);
+		ret = -EOPNOTSUPP;
+	}
+
+	return ret;
+}
+
+static int gem_set_rxnfc(struct net_device *netdev, struct ethtool_rxnfc *cmd)
+{
+	struct macb *bp = netdev_priv(netdev);
+	int ret;
+
+	switch (cmd->cmd) {
+	case ETHTOOL_SRXCLSRLINS:
+		if ((cmd->fs.location >= bp->max_tuples)
+				|| (cmd->fs.ring_cookie >= bp->num_queues)) {
+			ret = -EINVAL;
+			break;
+		}
+		ret = gem_add_flow_filter(netdev, cmd);
+		break;
+	case ETHTOOL_SRXCLSRLDEL:
+		ret = gem_del_flow_filter(netdev, cmd);
+		break;
+	default:
+		netdev_err(netdev,
+			  "Command parameter %d is not supported\n", cmd->cmd);
+		ret = -EOPNOTSUPP;
+	}
+
+	return ret;
+}
+
 static const struct ethtool_ops macb_ethtool_ops = {
 	.get_regs_len		= macb_get_regs_len,
 	.get_regs		= macb_get_regs,
@@ -2628,6 +2994,8 @@ static const struct ethtool_ops gem_ethtool_ops = {
 	.set_link_ksettings     = phy_ethtool_set_link_ksettings,
 	.get_ringparam		= macb_get_ringparam,
 	.set_ringparam		= macb_set_ringparam,
+	.get_rxnfc			= gem_get_rxnfc,
+	.set_rxnfc			= gem_set_rxnfc,
 };
 
 static int macb_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
@@ -2685,6 +3053,12 @@ static int macb_set_features(struct net_device *netdev,
 		gem_writel(bp, NCFGR, netcfg);
 	}
 
+	/* RX Flow Filters */
+	if ((changed & NETIF_F_NTUPLE) && macb_is_gem(bp)) {
+		bool turn_on = features & NETIF_F_NTUPLE;
+
+		gem_enable_flow_filters(bp, turn_on);
+	}
 	return 0;
 }
 
@@ -2850,7 +3224,7 @@ static int macb_init(struct platform_device *pdev)
 	struct macb *bp = netdev_priv(dev);
 	struct macb_queue *queue;
 	int err;
-	u32 val;
+	u32 val, reg;
 
 	bp->tx_ring_size = DEFAULT_TX_RING_SIZE;
 	bp->rx_ring_size = DEFAULT_RX_RING_SIZE;
@@ -2865,15 +3239,20 @@ static int macb_init(struct platform_device *pdev)
 
 		queue = &bp->queues[q];
 		queue->bp = bp;
+		netif_napi_add(dev, &queue->napi, macb_poll, 64);
 		if (hw_q) {
 			queue->ISR  = GEM_ISR(hw_q - 1);
 			queue->IER  = GEM_IER(hw_q - 1);
 			queue->IDR  = GEM_IDR(hw_q - 1);
 			queue->IMR  = GEM_IMR(hw_q - 1);
 			queue->TBQP = GEM_TBQP(hw_q - 1);
+			queue->RBQP = GEM_RBQP(hw_q - 1);
+			queue->RBQS = GEM_RBQS(hw_q - 1);
 #ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
-			if (bp->hw_dma_cap & HW_DMA_CAP_64B)
+			if (bp->hw_dma_cap & HW_DMA_CAP_64B) {
 				queue->TBQPH = GEM_TBQPH(hw_q - 1);
+				queue->RBQPH = GEM_RBQPH(hw_q - 1);
+			}
 #endif
 		} else {
 			/* queue0 uses legacy registers */
@@ -2882,9 +3261,12 @@ static int macb_init(struct platform_device *pdev)
 			queue->IDR  = MACB_IDR;
 			queue->IMR  = MACB_IMR;
 			queue->TBQP = MACB_TBQP;
+			queue->RBQP = MACB_RBQP;
 #ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
-			if (bp->hw_dma_cap & HW_DMA_CAP_64B)
+			if (bp->hw_dma_cap & HW_DMA_CAP_64B) {
 				queue->TBQPH = MACB_TBQPH;
+				queue->RBQPH = MACB_RBQPH;
+			}
 #endif
 		}
 
@@ -2908,7 +3290,6 @@ static int macb_init(struct platform_device *pdev)
 	}
 
 	dev->netdev_ops = &macb_netdev_ops;
-	netif_napi_add(dev, &bp->napi, macb_poll, 64);
 
 	/* setup appropriated routines according to adapter type */
 	if (macb_is_gem(bp)) {
@@ -2941,6 +3322,30 @@ static int macb_init(struct platform_device *pdev)
 		dev->hw_features &= ~NETIF_F_SG;
 	dev->features = dev->hw_features;
 
+	/* Check RX Flow Filters support.
+	 * Max Rx flows set by availability of screeners & compare regs:
+	 * each 4-tuple define requires 1 T2 screener reg + 3 compare regs
+	 */
+	reg = gem_readl(bp, DCFG8);
+	bp->max_tuples = min((GEM_BFEXT(SCR2CMP, reg) / 3),
+			GEM_BFEXT(T2SCR, reg));
+	if (bp->max_tuples > 0) {
+		/* also needs one ethtype match to check IPv4 */
+		if (GEM_BFEXT(SCR2ETH, reg) > 0) {
+			/* program this reg now */
+			reg = 0;
+			reg = GEM_BFINS(ETHTCMP, (uint16_t)ETH_P_IP, reg);
+			gem_writel_n(bp, ETHT, SCRT2_ETHT, reg);
+			/* Filtering is supported in hw but don't enable it in kernel now */
+			dev->hw_features |= NETIF_F_NTUPLE;
+			/* init Rx flow definitions */
+			INIT_LIST_HEAD(&bp->rx_fs_list.list);
+			bp->rx_fs_list.count = 0;
+			spin_lock_init(&bp->rx_fs_lock);
+		} else
+			bp->max_tuples = 0;
+	}
+
 	if (!(bp->caps & MACB_CAPS_USRIO_DISABLED)) {
 		val = 0;
 		if (bp->phy_interface == PHY_INTERFACE_MODE_RGMII)
@@ -2977,34 +3382,35 @@ static int macb_init(struct platform_device *pdev)
 static int at91ether_start(struct net_device *dev)
 {
 	struct macb *lp = netdev_priv(dev);
+	struct macb_queue *q = &lp->queues[0];
 	struct macb_dma_desc *desc;
 	dma_addr_t addr;
 	u32 ctl;
 	int i;
 
-	lp->rx_ring = dma_alloc_coherent(&lp->pdev->dev,
+	q->rx_ring = dma_alloc_coherent(&lp->pdev->dev,
 					 (AT91ETHER_MAX_RX_DESCR *
 					  macb_dma_desc_get_size(lp)),
-					 &lp->rx_ring_dma, GFP_KERNEL);
-	if (!lp->rx_ring)
+					 &q->rx_ring_dma, GFP_KERNEL);
+	if (!q->rx_ring)
 		return -ENOMEM;
 
-	lp->rx_buffers = dma_alloc_coherent(&lp->pdev->dev,
+	q->rx_buffers = dma_alloc_coherent(&lp->pdev->dev,
 					    AT91ETHER_MAX_RX_DESCR *
 					    AT91ETHER_MAX_RBUFF_SZ,
-					    &lp->rx_buffers_dma, GFP_KERNEL);
-	if (!lp->rx_buffers) {
+					    &q->rx_buffers_dma, GFP_KERNEL);
+	if (!q->rx_buffers) {
 		dma_free_coherent(&lp->pdev->dev,
 				  AT91ETHER_MAX_RX_DESCR *
 				  macb_dma_desc_get_size(lp),
-				  lp->rx_ring, lp->rx_ring_dma);
-		lp->rx_ring = NULL;
+				  q->rx_ring, q->rx_ring_dma);
+		q->rx_ring = NULL;
 		return -ENOMEM;
 	}
 
-	addr = lp->rx_buffers_dma;
+	addr = q->rx_buffers_dma;
 	for (i = 0; i < AT91ETHER_MAX_RX_DESCR; i++) {
-		desc = macb_rx_desc(lp, i);
+		desc = macb_rx_desc(q, i);
 		macb_set_addr(lp, desc, addr);
 		desc->ctrl = 0;
 		addr += AT91ETHER_MAX_RBUFF_SZ;
@@ -3014,10 +3420,10 @@ static int at91ether_start(struct net_device *dev)
 	desc->addr |= MACB_BIT(RX_WRAP);
 
 	/* Reset buffer index */
-	lp->rx_tail = 0;
+	q->rx_tail = 0;
 
 	/* Program address of descriptor list in Rx Buffer Queue register */
-	macb_writel(lp, RBQP, lp->rx_ring_dma);
+	macb_writel(lp, RBQP, q->rx_ring_dma);
 
 	/* Enable Receive and Transmit */
 	ctl = macb_readl(lp, NCR);
@@ -3064,6 +3470,7 @@ static int at91ether_open(struct net_device *dev)
 static int at91ether_close(struct net_device *dev)
 {
 	struct macb *lp = netdev_priv(dev);
+	struct macb_queue *q = &lp->queues[0];
 	u32 ctl;
 
 	/* Disable Receiver and Transmitter */
@@ -3084,13 +3491,13 @@ static int at91ether_close(struct net_device *dev)
 	dma_free_coherent(&lp->pdev->dev,
 			  AT91ETHER_MAX_RX_DESCR *
 			  macb_dma_desc_get_size(lp),
-			  lp->rx_ring, lp->rx_ring_dma);
-	lp->rx_ring = NULL;
+			  q->rx_ring, q->rx_ring_dma);
+	q->rx_ring = NULL;
 
 	dma_free_coherent(&lp->pdev->dev,
 			  AT91ETHER_MAX_RX_DESCR * AT91ETHER_MAX_RBUFF_SZ,
-			  lp->rx_buffers, lp->rx_buffers_dma);
-	lp->rx_buffers = NULL;
+			  q->rx_buffers, q->rx_buffers_dma);
+	q->rx_buffers = NULL;
 
 	return 0;
 }
@@ -3134,14 +3541,15 @@ static int at91ether_start_xmit(struct sk_buff *skb, struct net_device *dev)
 static void at91ether_rx(struct net_device *dev)
 {
 	struct macb *lp = netdev_priv(dev);
+	struct macb_queue *q = &lp->queues[0];
 	struct macb_dma_desc *desc;
 	unsigned char *p_recv;
 	struct sk_buff *skb;
 	unsigned int pktlen;
 
-	desc = macb_rx_desc(lp, lp->rx_tail);
+	desc = macb_rx_desc(q, q->rx_tail);
 	while (desc->addr & MACB_BIT(RX_USED)) {
-		p_recv = lp->rx_buffers + lp->rx_tail * AT91ETHER_MAX_RBUFF_SZ;
+		p_recv = q->rx_buffers + q->rx_tail * AT91ETHER_MAX_RBUFF_SZ;
 		pktlen = MACB_BF(RX_FRMLEN, desc->ctrl);
 		skb = netdev_alloc_skb(dev, pktlen + 2);
 		if (skb) {
@@ -3163,12 +3571,12 @@ static void at91ether_rx(struct net_device *dev)
 		desc->addr &= ~MACB_BIT(RX_USED);
 
 		/* wrap after last buffer */
-		if (lp->rx_tail == AT91ETHER_MAX_RX_DESCR - 1)
-			lp->rx_tail = 0;
+		if (q->rx_tail == AT91ETHER_MAX_RX_DESCR - 1)
+			q->rx_tail = 0;
 		else
-			lp->rx_tail++;
+			q->rx_tail++;
 
-		desc = macb_rx_desc(lp, lp->rx_tail);
+		desc = macb_rx_desc(q, q->rx_tail);
 	}
 }
 
@@ -3394,7 +3802,6 @@ static int macb_probe(struct platform_device *pdev)
 					      = macb_config->clk_init;
 	int (*init)(struct platform_device *) = macb_config->init;
 	struct device_node *np = pdev->dev.of_node;
-	struct device_node *phy_node;
 	struct clk *pclk, *hclk = NULL, *tx_clk = NULL, *rx_clk = NULL;
 	unsigned int queue_mask, num_queues;
 	struct macb_platform_data *pdata;
@@ -3500,18 +3907,6 @@ static int macb_probe(struct platform_device *pdev)
 	else
 		macb_get_hwaddr(bp);
 
-	/* Power up the PHY if there is a GPIO reset */
-	phy_node =  of_get_next_available_child(np, NULL);
-	if (phy_node) {
-		int gpio = of_get_named_gpio(phy_node, "reset-gpios", 0);
-
-		if (gpio_is_valid(gpio)) {
-			bp->reset_gpio = gpio_to_desc(gpio);
-			gpiod_direction_output(bp->reset_gpio, 1);
-		}
-	}
-	of_node_put(phy_node);
-
 	err = of_get_phy_mode(np);
 	if (err < 0) {
 		pdata = dev_get_platdata(&pdev->dev);
@@ -3558,10 +3953,6 @@ err_out_unregister_mdio:
 		of_phy_deregister_fixed_link(np);
 	mdiobus_free(bp->mii_bus);
 
-	/* Shutdown the PHY if there is a GPIO reset */
-	if (bp->reset_gpio)
-		gpiod_set_value(bp->reset_gpio, 0);
-
 err_out_free_netdev:
 	free_netdev(dev);
 
@@ -3592,10 +3983,6 @@ static int macb_remove(struct platform_device *pdev)
 		dev->phydev = NULL;
 		mdiobus_free(bp->mii_bus);
 
-		/* Shutdown the PHY if there is a GPIO reset */
-		if (bp->reset_gpio)
-			gpiod_set_value(bp->reset_gpio, 0);
-
 		unregister_netdev(dev);
 		clk_disable_unprepare(bp->tx_clk);
 		clk_disable_unprepare(bp->hclk);
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
index a063c36c4c58..52b3a6044f85 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
@@ -65,6 +65,11 @@ module_param(cpi_alg, int, S_IRUGO);
 MODULE_PARM_DESC(cpi_alg,
 		 "PFC algorithm (0=none, 1=VLAN, 2=VLAN16, 3=IP Diffserv)");
 
+struct nicvf_xdp_tx {
+	u64 dma_addr;
+	u8  qidx;
+};
+
 static inline u8 nicvf_netdev_qidx(struct nicvf *nic, u8 qidx)
 {
 	if (nic->sqs_mode)
@@ -500,14 +505,29 @@ static int nicvf_init_resources(struct nicvf *nic)
 	return 0;
 }
 
+static void nicvf_unmap_page(struct nicvf *nic, struct page *page, u64 dma_addr)
+{
+	/* Check if it's a recycled page, if not unmap the DMA mapping.
+	 * Recycled page holds an extra reference.
+	 */
+	if (page_ref_count(page) == 1) {
+		dma_addr &= PAGE_MASK;
+		dma_unmap_page_attrs(&nic->pdev->dev, dma_addr,
+				     RCV_FRAG_LEN + XDP_HEADROOM,
+				     DMA_FROM_DEVICE,
+				     DMA_ATTR_SKIP_CPU_SYNC);
+	}
+}
+
 static inline bool nicvf_xdp_rx(struct nicvf *nic, struct bpf_prog *prog,
 				struct cqe_rx_t *cqe_rx, struct snd_queue *sq,
 				struct sk_buff **skb)
 {
 	struct xdp_buff xdp;
 	struct page *page;
+	struct nicvf_xdp_tx *xdp_tx = NULL;
 	u32 action;
-	u16 len, offset = 0;
+	u16 len, err, offset = 0;
 	u64 dma_addr, cpu_addr;
 	void *orig_data;
 
@@ -521,7 +541,7 @@ static inline bool nicvf_xdp_rx(struct nicvf *nic, struct bpf_prog *prog,
 	cpu_addr = (u64)phys_to_virt(cpu_addr);
 	page = virt_to_page((void *)cpu_addr);
 
-	xdp.data_hard_start = page_address(page);
+	xdp.data_hard_start = page_address(page) + RCV_BUF_HEADROOM;
 	xdp.data = (void *)cpu_addr;
 	xdp_set_data_meta_invalid(&xdp);
 	xdp.data_end = xdp.data + len;
@@ -540,18 +560,7 @@ static inline bool nicvf_xdp_rx(struct nicvf *nic, struct bpf_prog *prog,
 
 	switch (action) {
 	case XDP_PASS:
-		/* Check if it's a recycled page, if not
-		 * unmap the DMA mapping.
-		 *
-		 * Recycled page holds an extra reference.
-		 */
-		if (page_ref_count(page) == 1) {
-			dma_addr &= PAGE_MASK;
-			dma_unmap_page_attrs(&nic->pdev->dev, dma_addr,
-					     RCV_FRAG_LEN + XDP_PACKET_HEADROOM,
-					     DMA_FROM_DEVICE,
-					     DMA_ATTR_SKIP_CPU_SYNC);
-		}
+		nicvf_unmap_page(nic, page, dma_addr);
 
 		/* Build SKB and pass on packet to network stack */
 		*skb = build_skb(xdp.data,
@@ -564,6 +573,20 @@ static inline bool nicvf_xdp_rx(struct nicvf *nic, struct bpf_prog *prog,
 	case XDP_TX:
 		nicvf_xdp_sq_append_pkt(nic, sq, (u64)xdp.data, dma_addr, len);
 		return true;
+	case XDP_REDIRECT:
+		/* Save DMA address for use while transmitting */
+		xdp_tx = (struct nicvf_xdp_tx *)page_address(page);
+		xdp_tx->dma_addr = dma_addr;
+		xdp_tx->qidx = nicvf_netdev_qidx(nic, cqe_rx->rq_idx);
+
+		err = xdp_do_redirect(nic->pnicvf->netdev, &xdp, prog);
+		if (!err)
+			return true;
+
+		/* Free the page on error */
+		nicvf_unmap_page(nic, page, dma_addr);
+		put_page(page);
+		break;
 	default:
 		bpf_warn_invalid_xdp_action(action);
 		/* fall through */
@@ -571,18 +594,7 @@ static inline bool nicvf_xdp_rx(struct nicvf *nic, struct bpf_prog *prog,
 		trace_xdp_exception(nic->netdev, prog, action);
 		/* fall through */
 	case XDP_DROP:
-		/* Check if it's a recycled page, if not
-		 * unmap the DMA mapping.
-		 *
-		 * Recycled page holds an extra reference.
-		 */
-		if (page_ref_count(page) == 1) {
-			dma_addr &= PAGE_MASK;
-			dma_unmap_page_attrs(&nic->pdev->dev, dma_addr,
-					     RCV_FRAG_LEN + XDP_PACKET_HEADROOM,
-					     DMA_FROM_DEVICE,
-					     DMA_ATTR_SKIP_CPU_SYNC);
-		}
+		nicvf_unmap_page(nic, page, dma_addr);
 		put_page(page);
 		return true;
 	}
@@ -1764,6 +1776,50 @@ static int nicvf_xdp(struct net_device *netdev, struct netdev_bpf *xdp)
 	}
 }
 
+static int nicvf_xdp_xmit(struct net_device *netdev, struct xdp_buff *xdp)
+{
+	struct nicvf *nic = netdev_priv(netdev);
+	struct nicvf *snic = nic;
+	struct nicvf_xdp_tx *xdp_tx;
+	struct snd_queue *sq;
+	struct page *page;
+	int err, qidx;
+
+	if (!netif_running(netdev) || !nic->xdp_prog)
+		return -EINVAL;
+
+	page = virt_to_page(xdp->data);
+	xdp_tx = (struct nicvf_xdp_tx *)page_address(page);
+	qidx = xdp_tx->qidx;
+
+	if (xdp_tx->qidx >= nic->xdp_tx_queues)
+		return -EINVAL;
+
+	/* Get secondary Qset's info */
+	if (xdp_tx->qidx >= MAX_SND_QUEUES_PER_QS) {
+		qidx = xdp_tx->qidx / MAX_SND_QUEUES_PER_QS;
+		snic = (struct nicvf *)nic->snicvf[qidx - 1];
+		if (!snic)
+			return -EINVAL;
+		qidx = xdp_tx->qidx % MAX_SND_QUEUES_PER_QS;
+	}
+
+	sq = &snic->qs->sq[qidx];
+	err = nicvf_xdp_sq_append_pkt(snic, sq, (u64)xdp->data,
+				      xdp_tx->dma_addr,
+				      xdp->data_end - xdp->data);
+	if (err)
+		return -ENOMEM;
+
+	nicvf_xdp_sq_doorbell(snic, sq, qidx);
+	return 0;
+}
+
+static void nicvf_xdp_flush(struct net_device *dev)
+{
+	return;
+}
+
 static const struct net_device_ops nicvf_netdev_ops = {
 	.ndo_open		= nicvf_open,
 	.ndo_stop		= nicvf_stop,
@@ -1775,6 +1831,8 @@ static const struct net_device_ops nicvf_netdev_ops = {
 	.ndo_fix_features       = nicvf_fix_features,
 	.ndo_set_features       = nicvf_set_features,
 	.ndo_bpf		= nicvf_xdp,
+	.ndo_xdp_xmit		= nicvf_xdp_xmit,
+	.ndo_xdp_flush          = nicvf_xdp_flush,
 };
 
 static int nicvf_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
@@ -1833,6 +1891,11 @@ static int nicvf_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	nic->pdev = pdev;
 	nic->pnicvf = nic;
 	nic->max_queues = qcount;
+	/* If no of CPUs are too low, there won't be any queues left
+	 * for XDP_TX, hence double it.
+	 */
+	if (!nic->t88)
+		nic->max_queues *= 2;
 
 	/* MAP VF's configuration registers */
 	nic->reg_base = pcim_iomap(pdev, PCI_CFG_REG_BAR_NUM, 0);
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
index a3d12dbde95b..f38ea349aa00 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
@@ -204,7 +204,7 @@ static inline int nicvf_alloc_rcv_buffer(struct nicvf *nic, struct rbdr *rbdr,
 
 	/* Reserve space for header modifications by BPF program */
 	if (rbdr->is_xdp)
-		buf_len += XDP_PACKET_HEADROOM;
+		buf_len += XDP_HEADROOM;
 
 	/* Check if it's recycled */
 	if (pgcache)
@@ -224,8 +224,9 @@ ret:
 			nic->rb_page = NULL;
 			return -ENOMEM;
 		}
+
 		if (pgcache)
-			pgcache->dma_addr = *rbuf + XDP_PACKET_HEADROOM;
+			pgcache->dma_addr = *rbuf + XDP_HEADROOM;
 		nic->rb_page_offset += buf_len;
 	}
 
@@ -1236,7 +1237,7 @@ int nicvf_xdp_sq_append_pkt(struct nicvf *nic, struct snd_queue *sq,
 	int qentry;
 
 	if (subdesc_cnt > sq->xdp_free_cnt)
-		return 0;
+		return -1;
 
 	qentry = nicvf_get_sq_desc(sq, subdesc_cnt);
 
@@ -1247,7 +1248,7 @@ int nicvf_xdp_sq_append_pkt(struct nicvf *nic, struct snd_queue *sq,
 
 	sq->xdp_desc_cnt += subdesc_cnt;
 
-	return 1;
+	return 0;
 }
 
 /* Calculate no of SQ subdescriptors needed to transmit all
@@ -1625,7 +1626,7 @@ static void nicvf_unmap_rcv_buffer(struct nicvf *nic, u64 dma_addr,
 		if (page_ref_count(page) != 1)
 			return;
 
-		len += XDP_PACKET_HEADROOM;
+		len += XDP_HEADROOM;
 		/* Receive buffers in XDP mode are mapped from page start */
 		dma_addr &= PAGE_MASK;
 	}
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.h b/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
index 67d1a3230773..178ab6e8e3c5 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.h
@@ -11,6 +11,7 @@
 
 #include <linux/netdevice.h>
 #include <linux/iommu.h>
+#include <linux/bpf.h>
 #include "q_struct.h"
 
 #define MAX_QUEUE_SET			128
@@ -92,6 +93,9 @@
 #define RCV_FRAG_LEN	 (SKB_DATA_ALIGN(DMA_BUFFER_LEN + NET_SKB_PAD) + \
 			 SKB_DATA_ALIGN(sizeof(struct skb_shared_info)))
 
+#define RCV_BUF_HEADROOM	128 /* To store dma address for XDP redirect */
+#define XDP_HEADROOM		(XDP_PACKET_HEADROOM + RCV_BUF_HEADROOM)
+
 #define MAX_CQES_FOR_TX		((SND_QUEUE_LEN / MIN_SQ_DESC_PER_PKT_XMIT) * \
 				 MAX_CQE_PER_PKT_XMIT)
 
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cudbg_entity.h b/drivers/net/ethernet/chelsio/cxgb4/cudbg_entity.h
index 605689957496..2e71e334d819 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cudbg_entity.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cudbg_entity.h
@@ -18,17 +18,15 @@
 #ifndef __CUDBG_ENTITY_H__
 #define __CUDBG_ENTITY_H__
 
-#define EDC0_FLAG 3
-#define EDC1_FLAG 4
+#define EDC0_FLAG 0
+#define EDC1_FLAG 1
+#define MC_FLAG 2
+#define MC0_FLAG 3
+#define MC1_FLAG 4
+#define HMA_FLAG 5
 
 #define CUDBG_ENTITY_SIGNATURE 0xCCEDB001
 
-struct card_mem {
-	u16 size_edc0;
-	u16 size_edc1;
-	u16 mem_flag;
-};
-
 struct cudbg_mbox_log {
 	struct mbox_cmd entry;
 	u32 hi[MBOX_LEN / 8];
@@ -87,6 +85,48 @@ struct cudbg_tp_la {
 	u8 data[0];
 };
 
+static const char * const cudbg_region[] = {
+	"DBQ contexts:", "IMSG contexts:", "FLM cache:", "TCBs:",
+	"Pstructs:", "Timers:", "Rx FL:", "Tx FL:", "Pstruct FL:",
+	"Tx payload:", "Rx payload:", "LE hash:", "iSCSI region:",
+	"TDDP region:", "TPT region:", "STAG region:", "RQ region:",
+	"RQUDP region:", "PBL region:", "TXPBL region:",
+	"DBVFIFO region:", "ULPRX state:", "ULPTX state:",
+	"On-chip queues:"
+};
+
+/* Memory region info relative to current memory (i.e. wrt 0). */
+struct cudbg_region_info {
+	bool exist; /* Does region exists in current memory? */
+	u32 start;  /* Start wrt 0 */
+	u32 end;    /* End wrt 0 */
+};
+
+struct cudbg_mem_desc {
+	u32 base;
+	u32 limit;
+	u32 idx;
+};
+
+struct cudbg_meminfo {
+	struct cudbg_mem_desc avail[4];
+	struct cudbg_mem_desc mem[ARRAY_SIZE(cudbg_region) + 3];
+	u32 avail_c;
+	u32 mem_c;
+	u32 up_ram_lo;
+	u32 up_ram_hi;
+	u32 up_extmem2_lo;
+	u32 up_extmem2_hi;
+	u32 rx_pages_data[3];
+	u32 tx_pages_data[4];
+	u32 p_structs;
+	u32 reserved[12];
+	u32 port_used[4];
+	u32 port_alloc[4];
+	u32 loopback_used[NCHAN];
+	u32 loopback_alloc[NCHAN];
+};
+
 struct cudbg_cim_pif_la {
 	int size;
 	u8 data[0];
@@ -145,6 +185,7 @@ struct cudbg_tid_info_region_rev1 {
 	u32 reserved[16];
 };
 
+#define CUDBG_LOWMEM_MAX_CTXT_QIDS 256
 #define CUDBG_MAX_FL_QIDS 1024
 
 struct cudbg_ch_cntxt {
@@ -334,6 +375,25 @@ static const u32 t5_pm_tx_array[][IREG_NUM_ELEM] = {
 	{0x8FF0, 0x8FF4, 0x10021, 0x1D}, /* t5_pm_tx_regs_10021_to_1003c */
 };
 
+#define CUDBG_NUM_PCIE_CONFIG_REGS 0x61
+
+static const u32 t5_pcie_config_array[][2] = {
+	{0x0, 0x34},
+	{0x3c, 0x40},
+	{0x50, 0x64},
+	{0x70, 0x80},
+	{0x94, 0xa0},
+	{0xb0, 0xb8},
+	{0xd0, 0xd4},
+	{0x100, 0x128},
+	{0x140, 0x148},
+	{0x150, 0x164},
+	{0x170, 0x178},
+	{0x180, 0x194},
+	{0x1a0, 0x1b8},
+	{0x1c0, 0x208},
+};
+
 static const u32 t6_ma_ireg_array[][IREG_NUM_ELEM] = {
 	{0x78f8, 0x78fc, 0xa000, 23}, /* t6_ma_regs_a000_to_a016 */
 	{0x78f8, 0x78fc, 0xa400, 30}, /* t6_ma_regs_a400_to_a41e */
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cudbg_if.h b/drivers/net/ethernet/chelsio/cxgb4/cudbg_if.h
index e10ff1ee62c5..e8173ae32158 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cudbg_if.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cudbg_if.h
@@ -47,6 +47,8 @@ enum cudbg_dbg_entity_type {
 	CUDBG_CIM_OBQ_NCSI = 17,
 	CUDBG_EDC0 = 18,
 	CUDBG_EDC1 = 19,
+	CUDBG_MC0 = 20,
+	CUDBG_MC1 = 21,
 	CUDBG_RSS = 22,
 	CUDBG_RSS_VF_CONF = 25,
 	CUDBG_PATH_MTU = 27,
@@ -56,6 +58,7 @@ enum cudbg_dbg_entity_type {
 	CUDBG_SGE_INDIRECT = 37,
 	CUDBG_ULPRX_LA = 41,
 	CUDBG_TP_LA = 43,
+	CUDBG_MEMINFO = 44,
 	CUDBG_CIM_PIF_LA = 45,
 	CUDBG_CLK = 46,
 	CUDBG_CIM_OBQ_RXQ0 = 47,
@@ -63,6 +66,7 @@ enum cudbg_dbg_entity_type {
 	CUDBG_PCIE_INDIRECT = 50,
 	CUDBG_PM_INDIRECT = 51,
 	CUDBG_TID_INFO = 54,
+	CUDBG_PCIE_CONFIG = 55,
 	CUDBG_DUMP_CONTEXT = 56,
 	CUDBG_MPS_TCAM = 57,
 	CUDBG_VPD_DATA = 58,
@@ -74,6 +78,7 @@ enum cudbg_dbg_entity_type {
 	CUDBG_PBT_TABLE = 65,
 	CUDBG_MBOX_LOG = 66,
 	CUDBG_HMA_INDIRECT = 67,
+	CUDBG_HMA = 68,
 	CUDBG_MAX_ENTITY = 70,
 };
 
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c b/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c
index d699bf88d18f..d73fb6a85f8e 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c
@@ -15,12 +15,14 @@
  *
  */
 
+#include <linux/sort.h>
+
 #include "t4_regs.h"
 #include "cxgb4.h"
 #include "cudbg_if.h"
 #include "cudbg_lib_common.h"
-#include "cudbg_lib.h"
 #include "cudbg_entity.h"
+#include "cudbg_lib.h"
 
 static void cudbg_write_and_release_buff(struct cudbg_buffer *pin_buff,
 					 struct cudbg_buffer *dbg_buff)
@@ -84,6 +86,277 @@ static int cudbg_read_vpd_reg(struct adapter *padap, u32 addr, u32 len,
 	return 0;
 }
 
+static int cudbg_mem_desc_cmp(const void *a, const void *b)
+{
+	return ((const struct cudbg_mem_desc *)a)->base -
+	       ((const struct cudbg_mem_desc *)b)->base;
+}
+
+int cudbg_fill_meminfo(struct adapter *padap,
+		       struct cudbg_meminfo *meminfo_buff)
+{
+	struct cudbg_mem_desc *md;
+	u32 lo, hi, used, alloc;
+	int n, i;
+
+	memset(meminfo_buff->avail, 0,
+	       ARRAY_SIZE(meminfo_buff->avail) *
+	       sizeof(struct cudbg_mem_desc));
+	memset(meminfo_buff->mem, 0,
+	       (ARRAY_SIZE(cudbg_region) + 3) * sizeof(struct cudbg_mem_desc));
+	md  = meminfo_buff->mem;
+
+	for (i = 0; i < ARRAY_SIZE(meminfo_buff->mem); i++) {
+		meminfo_buff->mem[i].limit = 0;
+		meminfo_buff->mem[i].idx = i;
+	}
+
+	/* Find and sort the populated memory ranges */
+	i = 0;
+	lo = t4_read_reg(padap, MA_TARGET_MEM_ENABLE_A);
+	if (lo & EDRAM0_ENABLE_F) {
+		hi = t4_read_reg(padap, MA_EDRAM0_BAR_A);
+		meminfo_buff->avail[i].base =
+			cudbg_mbytes_to_bytes(EDRAM0_BASE_G(hi));
+		meminfo_buff->avail[i].limit =
+			meminfo_buff->avail[i].base +
+			cudbg_mbytes_to_bytes(EDRAM0_SIZE_G(hi));
+		meminfo_buff->avail[i].idx = 0;
+		i++;
+	}
+
+	if (lo & EDRAM1_ENABLE_F) {
+		hi =  t4_read_reg(padap, MA_EDRAM1_BAR_A);
+		meminfo_buff->avail[i].base =
+			cudbg_mbytes_to_bytes(EDRAM1_BASE_G(hi));
+		meminfo_buff->avail[i].limit =
+			meminfo_buff->avail[i].base +
+			cudbg_mbytes_to_bytes(EDRAM1_SIZE_G(hi));
+		meminfo_buff->avail[i].idx = 1;
+		i++;
+	}
+
+	if (is_t5(padap->params.chip)) {
+		if (lo & EXT_MEM0_ENABLE_F) {
+			hi = t4_read_reg(padap, MA_EXT_MEMORY0_BAR_A);
+			meminfo_buff->avail[i].base =
+				cudbg_mbytes_to_bytes(EXT_MEM_BASE_G(hi));
+			meminfo_buff->avail[i].limit =
+				meminfo_buff->avail[i].base +
+				cudbg_mbytes_to_bytes(EXT_MEM_SIZE_G(hi));
+			meminfo_buff->avail[i].idx = 3;
+			i++;
+		}
+
+		if (lo & EXT_MEM1_ENABLE_F) {
+			hi = t4_read_reg(padap, MA_EXT_MEMORY1_BAR_A);
+			meminfo_buff->avail[i].base =
+				cudbg_mbytes_to_bytes(EXT_MEM1_BASE_G(hi));
+			meminfo_buff->avail[i].limit =
+				meminfo_buff->avail[i].base +
+				cudbg_mbytes_to_bytes(EXT_MEM1_SIZE_G(hi));
+			meminfo_buff->avail[i].idx = 4;
+			i++;
+		}
+	} else {
+		if (lo & EXT_MEM_ENABLE_F) {
+			hi = t4_read_reg(padap, MA_EXT_MEMORY_BAR_A);
+			meminfo_buff->avail[i].base =
+				cudbg_mbytes_to_bytes(EXT_MEM_BASE_G(hi));
+			meminfo_buff->avail[i].limit =
+				meminfo_buff->avail[i].base +
+				cudbg_mbytes_to_bytes(EXT_MEM_SIZE_G(hi));
+			meminfo_buff->avail[i].idx = 2;
+			i++;
+		}
+
+		if (lo & HMA_MUX_F) {
+			hi = t4_read_reg(padap, MA_EXT_MEMORY1_BAR_A);
+			meminfo_buff->avail[i].base =
+				cudbg_mbytes_to_bytes(EXT_MEM1_BASE_G(hi));
+			meminfo_buff->avail[i].limit =
+				meminfo_buff->avail[i].base +
+				cudbg_mbytes_to_bytes(EXT_MEM1_SIZE_G(hi));
+			meminfo_buff->avail[i].idx = 5;
+			i++;
+		}
+	}
+
+	if (!i) /* no memory available */
+		return CUDBG_STATUS_ENTITY_NOT_FOUND;
+
+	meminfo_buff->avail_c = i;
+	sort(meminfo_buff->avail, i, sizeof(struct cudbg_mem_desc),
+	     cudbg_mem_desc_cmp, NULL);
+	(md++)->base = t4_read_reg(padap, SGE_DBQ_CTXT_BADDR_A);
+	(md++)->base = t4_read_reg(padap, SGE_IMSG_CTXT_BADDR_A);
+	(md++)->base = t4_read_reg(padap, SGE_FLM_CACHE_BADDR_A);
+	(md++)->base = t4_read_reg(padap, TP_CMM_TCB_BASE_A);
+	(md++)->base = t4_read_reg(padap, TP_CMM_MM_BASE_A);
+	(md++)->base = t4_read_reg(padap, TP_CMM_TIMER_BASE_A);
+	(md++)->base = t4_read_reg(padap, TP_CMM_MM_RX_FLST_BASE_A);
+	(md++)->base = t4_read_reg(padap, TP_CMM_MM_TX_FLST_BASE_A);
+	(md++)->base = t4_read_reg(padap, TP_CMM_MM_PS_FLST_BASE_A);
+
+	/* the next few have explicit upper bounds */
+	md->base = t4_read_reg(padap, TP_PMM_TX_BASE_A);
+	md->limit = md->base - 1 +
+		    t4_read_reg(padap, TP_PMM_TX_PAGE_SIZE_A) *
+		    PMTXMAXPAGE_G(t4_read_reg(padap, TP_PMM_TX_MAX_PAGE_A));
+	md++;
+
+	md->base = t4_read_reg(padap, TP_PMM_RX_BASE_A);
+	md->limit = md->base - 1 +
+		    t4_read_reg(padap, TP_PMM_RX_PAGE_SIZE_A) *
+		    PMRXMAXPAGE_G(t4_read_reg(padap, TP_PMM_RX_MAX_PAGE_A));
+	md++;
+
+	if (t4_read_reg(padap, LE_DB_CONFIG_A) & HASHEN_F) {
+		if (CHELSIO_CHIP_VERSION(padap->params.chip) <= CHELSIO_T5) {
+			hi = t4_read_reg(padap, LE_DB_TID_HASHBASE_A) / 4;
+			md->base = t4_read_reg(padap, LE_DB_HASH_TID_BASE_A);
+		} else {
+			hi = t4_read_reg(padap, LE_DB_HASH_TID_BASE_A);
+			md->base = t4_read_reg(padap,
+					       LE_DB_HASH_TBL_BASE_ADDR_A);
+		}
+		md->limit = 0;
+	} else {
+		md->base = 0;
+		md->idx = ARRAY_SIZE(cudbg_region);  /* hide it */
+	}
+	md++;
+
+#define ulp_region(reg) do { \
+	md->base = t4_read_reg(padap, ULP_ ## reg ## _LLIMIT_A);\
+	(md++)->limit = t4_read_reg(padap, ULP_ ## reg ## _ULIMIT_A);\
+} while (0)
+
+	ulp_region(RX_ISCSI);
+	ulp_region(RX_TDDP);
+	ulp_region(TX_TPT);
+	ulp_region(RX_STAG);
+	ulp_region(RX_RQ);
+	ulp_region(RX_RQUDP);
+	ulp_region(RX_PBL);
+	ulp_region(TX_PBL);
+#undef ulp_region
+	md->base = 0;
+	md->idx = ARRAY_SIZE(cudbg_region);
+	if (!is_t4(padap->params.chip)) {
+		u32 fifo_size = t4_read_reg(padap, SGE_DBVFIFO_SIZE_A);
+		u32 sge_ctrl = t4_read_reg(padap, SGE_CONTROL2_A);
+		u32 size = 0;
+
+		if (is_t5(padap->params.chip)) {
+			if (sge_ctrl & VFIFO_ENABLE_F)
+				size = DBVFIFO_SIZE_G(fifo_size);
+		} else {
+			size = T6_DBVFIFO_SIZE_G(fifo_size);
+		}
+
+		if (size) {
+			md->base = BASEADDR_G(t4_read_reg(padap,
+							  SGE_DBVFIFO_BADDR_A));
+			md->limit = md->base + (size << 2) - 1;
+		}
+	}
+
+	md++;
+
+	md->base = t4_read_reg(padap, ULP_RX_CTX_BASE_A);
+	md->limit = 0;
+	md++;
+	md->base = t4_read_reg(padap, ULP_TX_ERR_TABLE_BASE_A);
+	md->limit = 0;
+	md++;
+
+	md->base = padap->vres.ocq.start;
+	if (padap->vres.ocq.size)
+		md->limit = md->base + padap->vres.ocq.size - 1;
+	else
+		md->idx = ARRAY_SIZE(cudbg_region);  /* hide it */
+	md++;
+
+	/* add any address-space holes, there can be up to 3 */
+	for (n = 0; n < i - 1; n++)
+		if (meminfo_buff->avail[n].limit <
+		    meminfo_buff->avail[n + 1].base)
+			(md++)->base = meminfo_buff->avail[n].limit;
+
+	if (meminfo_buff->avail[n].limit)
+		(md++)->base = meminfo_buff->avail[n].limit;
+
+	n = md - meminfo_buff->mem;
+	meminfo_buff->mem_c = n;
+
+	sort(meminfo_buff->mem, n, sizeof(struct cudbg_mem_desc),
+	     cudbg_mem_desc_cmp, NULL);
+
+	lo = t4_read_reg(padap, CIM_SDRAM_BASE_ADDR_A);
+	hi = t4_read_reg(padap, CIM_SDRAM_ADDR_SIZE_A) + lo - 1;
+	meminfo_buff->up_ram_lo = lo;
+	meminfo_buff->up_ram_hi = hi;
+
+	lo = t4_read_reg(padap, CIM_EXTMEM2_BASE_ADDR_A);
+	hi = t4_read_reg(padap, CIM_EXTMEM2_ADDR_SIZE_A) + lo - 1;
+	meminfo_buff->up_extmem2_lo = lo;
+	meminfo_buff->up_extmem2_hi = hi;
+
+	lo = t4_read_reg(padap, TP_PMM_RX_MAX_PAGE_A);
+	meminfo_buff->rx_pages_data[0] =  PMRXMAXPAGE_G(lo);
+	meminfo_buff->rx_pages_data[1] =
+		t4_read_reg(padap, TP_PMM_RX_PAGE_SIZE_A) >> 10;
+	meminfo_buff->rx_pages_data[2] = (lo & PMRXNUMCHN_F) ? 2 : 1;
+
+	lo = t4_read_reg(padap, TP_PMM_TX_MAX_PAGE_A);
+	hi = t4_read_reg(padap, TP_PMM_TX_PAGE_SIZE_A);
+	meminfo_buff->tx_pages_data[0] = PMTXMAXPAGE_G(lo);
+	meminfo_buff->tx_pages_data[1] =
+		hi >= (1 << 20) ? (hi >> 20) : (hi >> 10);
+	meminfo_buff->tx_pages_data[2] =
+		hi >= (1 << 20) ? 'M' : 'K';
+	meminfo_buff->tx_pages_data[3] = 1 << PMTXNUMCHN_G(lo);
+
+	meminfo_buff->p_structs = t4_read_reg(padap, TP_CMM_MM_MAX_PSTRUCT_A);
+
+	for (i = 0; i < 4; i++) {
+		if (CHELSIO_CHIP_VERSION(padap->params.chip) > CHELSIO_T5)
+			lo = t4_read_reg(padap,
+					 MPS_RX_MAC_BG_PG_CNT0_A + i * 4);
+		else
+			lo = t4_read_reg(padap, MPS_RX_PG_RSV0_A + i * 4);
+		if (is_t5(padap->params.chip)) {
+			used = T5_USED_G(lo);
+			alloc = T5_ALLOC_G(lo);
+		} else {
+			used = USED_G(lo);
+			alloc = ALLOC_G(lo);
+		}
+		meminfo_buff->port_used[i] = used;
+		meminfo_buff->port_alloc[i] = alloc;
+	}
+
+	for (i = 0; i < padap->params.arch.nchan; i++) {
+		if (CHELSIO_CHIP_VERSION(padap->params.chip) > CHELSIO_T5)
+			lo = t4_read_reg(padap,
+					 MPS_RX_LPBK_BG_PG_CNT0_A + i * 4);
+		else
+			lo = t4_read_reg(padap, MPS_RX_PG_RSV4_A + i * 4);
+		if (is_t5(padap->params.chip)) {
+			used = T5_USED_G(lo);
+			alloc = T5_ALLOC_G(lo);
+		} else {
+			used = USED_G(lo);
+			alloc = ALLOC_G(lo);
+		}
+		meminfo_buff->loopback_used[i] = used;
+		meminfo_buff->loopback_alloc[i] = alloc;
+	}
+
+	return 0;
+}
+
 int cudbg_collect_reg_dump(struct cudbg_init *pdbg_init,
 			   struct cudbg_buffer *dbg_buff,
 			   struct cudbg_error *cudbg_err)
@@ -420,23 +693,211 @@ int cudbg_collect_obq_sge_rx_q1(struct cudbg_init *pdbg_init,
 	return cudbg_read_cim_obq(pdbg_init, dbg_buff, cudbg_err, 7);
 }
 
+static int cudbg_meminfo_get_mem_index(struct adapter *padap,
+				       struct cudbg_meminfo *mem_info,
+				       u8 mem_type, u8 *idx)
+{
+	u8 i, flag;
+
+	switch (mem_type) {
+	case MEM_EDC0:
+		flag = EDC0_FLAG;
+		break;
+	case MEM_EDC1:
+		flag = EDC1_FLAG;
+		break;
+	case MEM_MC0:
+		/* Some T5 cards have both MC0 and MC1. */
+		flag = is_t5(padap->params.chip) ? MC0_FLAG : MC_FLAG;
+		break;
+	case MEM_MC1:
+		flag = MC1_FLAG;
+		break;
+	case MEM_HMA:
+		flag = HMA_FLAG;
+		break;
+	default:
+		return CUDBG_STATUS_ENTITY_NOT_FOUND;
+	}
+
+	for (i = 0; i < mem_info->avail_c; i++) {
+		if (mem_info->avail[i].idx == flag) {
+			*idx = i;
+			return 0;
+		}
+	}
+
+	return CUDBG_STATUS_ENTITY_NOT_FOUND;
+}
+
+/* Fetch the @region_name's start and end from @meminfo. */
+static int cudbg_get_mem_region(struct adapter *padap,
+				struct cudbg_meminfo *meminfo,
+				u8 mem_type, const char *region_name,
+				struct cudbg_mem_desc *mem_desc)
+{
+	u8 mc, found = 0;
+	u32 i, idx = 0;
+	int rc;
+
+	rc = cudbg_meminfo_get_mem_index(padap, meminfo, mem_type, &mc);
+	if (rc)
+		return rc;
+
+	for (i = 0; i < ARRAY_SIZE(cudbg_region); i++) {
+		if (!strcmp(cudbg_region[i], region_name)) {
+			found = 1;
+			idx = i;
+			break;
+		}
+	}
+	if (!found)
+		return -EINVAL;
+
+	found = 0;
+	for (i = 0; i < meminfo->mem_c; i++) {
+		if (meminfo->mem[i].idx >= ARRAY_SIZE(cudbg_region))
+			continue; /* Skip holes */
+
+		if (!(meminfo->mem[i].limit))
+			meminfo->mem[i].limit =
+				i < meminfo->mem_c - 1 ?
+				meminfo->mem[i + 1].base - 1 : ~0;
+
+		if (meminfo->mem[i].idx == idx) {
+			/* Check if the region exists in @mem_type memory */
+			if (meminfo->mem[i].base < meminfo->avail[mc].base &&
+			    meminfo->mem[i].limit < meminfo->avail[mc].base)
+				return -EINVAL;
+
+			if (meminfo->mem[i].base > meminfo->avail[mc].limit)
+				return -EINVAL;
+
+			memcpy(mem_desc, &meminfo->mem[i],
+			       sizeof(struct cudbg_mem_desc));
+			found = 1;
+			break;
+		}
+	}
+	if (!found)
+		return -EINVAL;
+
+	return 0;
+}
+
+/* Fetch and update the start and end of the requested memory region w.r.t 0
+ * in the corresponding EDC/MC/HMA.
+ */
+static int cudbg_get_mem_relative(struct adapter *padap,
+				  struct cudbg_meminfo *meminfo,
+				  u8 mem_type, u32 *out_base, u32 *out_end)
+{
+	u8 mc_idx;
+	int rc;
+
+	rc = cudbg_meminfo_get_mem_index(padap, meminfo, mem_type, &mc_idx);
+	if (rc)
+		return rc;
+
+	if (*out_base < meminfo->avail[mc_idx].base)
+		*out_base = 0;
+	else
+		*out_base -= meminfo->avail[mc_idx].base;
+
+	if (*out_end > meminfo->avail[mc_idx].limit)
+		*out_end = meminfo->avail[mc_idx].limit;
+	else
+		*out_end -= meminfo->avail[mc_idx].base;
+
+	return 0;
+}
+
+/* Get TX and RX Payload region */
+static int cudbg_get_payload_range(struct adapter *padap, u8 mem_type,
+				   const char *region_name,
+				   struct cudbg_region_info *payload)
+{
+	struct cudbg_mem_desc mem_desc = { 0 };
+	struct cudbg_meminfo meminfo;
+	int rc;
+
+	rc = cudbg_fill_meminfo(padap, &meminfo);
+	if (rc)
+		return rc;
+
+	rc = cudbg_get_mem_region(padap, &meminfo, mem_type, region_name,
+				  &mem_desc);
+	if (rc) {
+		payload->exist = false;
+		return 0;
+	}
+
+	payload->exist = true;
+	payload->start = mem_desc.base;
+	payload->end = mem_desc.limit;
+
+	return cudbg_get_mem_relative(padap, &meminfo, mem_type,
+				      &payload->start, &payload->end);
+}
+
+#define CUDBG_YIELD_ITERATION 256
+
 static int cudbg_read_fw_mem(struct cudbg_init *pdbg_init,
 			     struct cudbg_buffer *dbg_buff, u8 mem_type,
 			     unsigned long tot_len,
 			     struct cudbg_error *cudbg_err)
 {
+	static const char * const region_name[] = { "Tx payload:",
+						    "Rx payload:" };
 	unsigned long bytes, bytes_left, bytes_read = 0;
 	struct adapter *padap = pdbg_init->adap;
 	struct cudbg_buffer temp_buff = { 0 };
+	struct cudbg_region_info payload[2];
+	u32 yield_count = 0;
 	int rc = 0;
+	u8 i;
+
+	/* Get TX/RX Payload region range if they exist */
+	memset(payload, 0, sizeof(payload));
+	for (i = 0; i < ARRAY_SIZE(region_name); i++) {
+		rc = cudbg_get_payload_range(padap, mem_type, region_name[i],
+					     &payload[i]);
+		if (rc)
+			return rc;
+
+		if (payload[i].exist) {
+			/* Align start and end to avoid wrap around */
+			payload[i].start = roundup(payload[i].start,
+						   CUDBG_CHUNK_SIZE);
+			payload[i].end = rounddown(payload[i].end,
+						   CUDBG_CHUNK_SIZE);
+		}
+	}
 
 	bytes_left = tot_len;
 	while (bytes_left > 0) {
+		/* As MC size is huge and read through PIO access, this
+		 * loop will hold cpu for a longer time. OS may think that
+		 * the process is hanged and will generate CPU stall traces.
+		 * So yield the cpu regularly.
+		 */
+		yield_count++;
+		if (!(yield_count % CUDBG_YIELD_ITERATION))
+			schedule();
+
 		bytes = min_t(unsigned long, bytes_left,
 			      (unsigned long)CUDBG_CHUNK_SIZE);
 		rc = cudbg_get_buff(dbg_buff, bytes, &temp_buff);
 		if (rc)
 			return rc;
+
+		for (i = 0; i < ARRAY_SIZE(payload); i++)
+			if (payload[i].exist &&
+			    bytes_read >= payload[i].start &&
+			    bytes_read + bytes <= payload[i].end)
+				/* TX and RX Payload regions can't overlap */
+				goto skip_read;
+
 		spin_lock(&padap->win0_lock);
 		rc = t4_memory_rw(padap, MEMWIN_NIC, mem_type,
 				  bytes_read, bytes,
@@ -448,6 +909,8 @@ static int cudbg_read_fw_mem(struct cudbg_init *pdbg_init,
 			cudbg_put_buff(&temp_buff, dbg_buff);
 			return rc;
 		}
+
+skip_read:
 		bytes_left -= bytes;
 		bytes_read += bytes;
 		cudbg_write_and_release_buff(&temp_buff, dbg_buff);
@@ -455,27 +918,6 @@ static int cudbg_read_fw_mem(struct cudbg_init *pdbg_init,
 	return rc;
 }
 
-static void cudbg_collect_mem_info(struct cudbg_init *pdbg_init,
-				   struct card_mem *mem_info)
-{
-	struct adapter *padap = pdbg_init->adap;
-	u32 value;
-
-	value = t4_read_reg(padap, MA_EDRAM0_BAR_A);
-	value = EDRAM0_SIZE_G(value);
-	mem_info->size_edc0 = (u16)value;
-
-	value = t4_read_reg(padap, MA_EDRAM1_BAR_A);
-	value = EDRAM1_SIZE_G(value);
-	mem_info->size_edc1 = (u16)value;
-
-	value = t4_read_reg(padap, MA_TARGET_MEM_ENABLE_A);
-	if (value & EDRAM0_ENABLE_F)
-		mem_info->mem_flag |= (1 << EDC0_FLAG);
-	if (value & EDRAM1_ENABLE_F)
-		mem_info->mem_flag |= (1 << EDC1_FLAG);
-}
-
 static void cudbg_t4_fwcache(struct cudbg_init *pdbg_init,
 			     struct cudbg_error *cudbg_err)
 {
@@ -495,37 +937,25 @@ static int cudbg_collect_mem_region(struct cudbg_init *pdbg_init,
 				    struct cudbg_error *cudbg_err,
 				    u8 mem_type)
 {
-	struct card_mem mem_info = {0};
-	unsigned long flag, size;
+	struct adapter *padap = pdbg_init->adap;
+	struct cudbg_meminfo mem_info;
+	unsigned long size;
+	u8 mc_idx;
 	int rc;
 
+	memset(&mem_info, 0, sizeof(struct cudbg_meminfo));
+	rc = cudbg_fill_meminfo(padap, &mem_info);
+	if (rc)
+		return rc;
+
 	cudbg_t4_fwcache(pdbg_init, cudbg_err);
-	cudbg_collect_mem_info(pdbg_init, &mem_info);
-	switch (mem_type) {
-	case MEM_EDC0:
-		flag = (1 << EDC0_FLAG);
-		size = cudbg_mbytes_to_bytes(mem_info.size_edc0);
-		break;
-	case MEM_EDC1:
-		flag = (1 << EDC1_FLAG);
-		size = cudbg_mbytes_to_bytes(mem_info.size_edc1);
-		break;
-	default:
-		rc = CUDBG_STATUS_ENTITY_NOT_FOUND;
-		goto err;
-	}
+	rc = cudbg_meminfo_get_mem_index(padap, &mem_info, mem_type, &mc_idx);
+	if (rc)
+		return rc;
 
-	if (mem_info.mem_flag & flag) {
-		rc = cudbg_read_fw_mem(pdbg_init, dbg_buff, mem_type,
-				       size, cudbg_err);
-		if (rc)
-			goto err;
-	} else {
-		rc = CUDBG_STATUS_ENTITY_NOT_FOUND;
-		goto err;
-	}
-err:
-	return rc;
+	size = mem_info.avail[mc_idx].limit - mem_info.avail[mc_idx].base;
+	return cudbg_read_fw_mem(pdbg_init, dbg_buff, mem_type, size,
+				 cudbg_err);
 }
 
 int cudbg_collect_edc0_meminfo(struct cudbg_init *pdbg_init,
@@ -544,6 +974,30 @@ int cudbg_collect_edc1_meminfo(struct cudbg_init *pdbg_init,
 					MEM_EDC1);
 }
 
+int cudbg_collect_mc0_meminfo(struct cudbg_init *pdbg_init,
+			      struct cudbg_buffer *dbg_buff,
+			      struct cudbg_error *cudbg_err)
+{
+	return cudbg_collect_mem_region(pdbg_init, dbg_buff, cudbg_err,
+					MEM_MC0);
+}
+
+int cudbg_collect_mc1_meminfo(struct cudbg_init *pdbg_init,
+			      struct cudbg_buffer *dbg_buff,
+			      struct cudbg_error *cudbg_err)
+{
+	return cudbg_collect_mem_region(pdbg_init, dbg_buff, cudbg_err,
+					MEM_MC1);
+}
+
+int cudbg_collect_hma_meminfo(struct cudbg_init *pdbg_init,
+			      struct cudbg_buffer *dbg_buff,
+			      struct cudbg_error *cudbg_err)
+{
+	return cudbg_collect_mem_region(pdbg_init, dbg_buff, cudbg_err,
+					MEM_HMA);
+}
+
 int cudbg_collect_rss(struct cudbg_init *pdbg_init,
 		      struct cudbg_buffer *dbg_buff,
 		      struct cudbg_error *cudbg_err)
@@ -843,6 +1297,31 @@ int cudbg_collect_tp_la(struct cudbg_init *pdbg_init,
 	return rc;
 }
 
+int cudbg_collect_meminfo(struct cudbg_init *pdbg_init,
+			  struct cudbg_buffer *dbg_buff,
+			  struct cudbg_error *cudbg_err)
+{
+	struct adapter *padap = pdbg_init->adap;
+	struct cudbg_buffer temp_buff = { 0 };
+	struct cudbg_meminfo *meminfo_buff;
+	int rc;
+
+	rc = cudbg_get_buff(dbg_buff, sizeof(struct cudbg_meminfo), &temp_buff);
+	if (rc)
+		return rc;
+
+	meminfo_buff = (struct cudbg_meminfo *)temp_buff.data;
+	rc = cudbg_fill_meminfo(padap, meminfo_buff);
+	if (rc) {
+		cudbg_err->sys_err = rc;
+		cudbg_put_buff(&temp_buff, dbg_buff);
+		return rc;
+	}
+
+	cudbg_write_and_release_buff(&temp_buff, dbg_buff);
+	return rc;
+}
+
 int cudbg_collect_cim_pif_la(struct cudbg_init *pdbg_init,
 			     struct cudbg_buffer *dbg_buff,
 			     struct cudbg_error *cudbg_err)
@@ -1115,22 +1594,135 @@ int cudbg_collect_tid(struct cudbg_init *pdbg_init,
 	return rc;
 }
 
-int cudbg_dump_context_size(struct adapter *padap)
+int cudbg_collect_pcie_config(struct cudbg_init *pdbg_init,
+			      struct cudbg_buffer *dbg_buff,
+			      struct cudbg_error *cudbg_err)
 {
-	u32 value, size;
+	struct adapter *padap = pdbg_init->adap;
+	struct cudbg_buffer temp_buff = { 0 };
+	u32 size, *value, j;
+	int i, rc, n;
+
+	size = sizeof(u32) * CUDBG_NUM_PCIE_CONFIG_REGS;
+	n = sizeof(t5_pcie_config_array) / (2 * sizeof(u32));
+	rc = cudbg_get_buff(dbg_buff, size, &temp_buff);
+	if (rc)
+		return rc;
+
+	value = (u32 *)temp_buff.data;
+	for (i = 0; i < n; i++) {
+		for (j = t5_pcie_config_array[i][0];
+		     j <= t5_pcie_config_array[i][1]; j += 4) {
+			t4_hw_pci_read_cfg4(padap, j, value);
+			value++;
+		}
+	}
+	cudbg_write_and_release_buff(&temp_buff, dbg_buff);
+	return rc;
+}
+
+static int cudbg_sge_ctxt_check_valid(u32 *buf, int type)
+{
+	int index, bit, bit_pos = 0;
+
+	switch (type) {
+	case CTXT_EGRESS:
+		bit_pos = 176;
+		break;
+	case CTXT_INGRESS:
+		bit_pos = 141;
+		break;
+	case CTXT_FLM:
+		bit_pos = 89;
+		break;
+	}
+	index = bit_pos / 32;
+	bit =  bit_pos % 32;
+	return buf[index] & (1U << bit);
+}
+
+static int cudbg_get_ctxt_region_info(struct adapter *padap,
+				      struct cudbg_region_info *ctx_info,
+				      u8 *mem_type)
+{
+	struct cudbg_mem_desc mem_desc;
+	struct cudbg_meminfo meminfo;
+	u32 i, j, value, found;
 	u8 flq;
+	int rc;
 
+	rc = cudbg_fill_meminfo(padap, &meminfo);
+	if (rc)
+		return rc;
+
+	/* Get EGRESS and INGRESS context region size */
+	for (i = CTXT_EGRESS; i <= CTXT_INGRESS; i++) {
+		found = 0;
+		memset(&mem_desc, 0, sizeof(struct cudbg_mem_desc));
+		for (j = 0; j < ARRAY_SIZE(meminfo.avail); j++) {
+			rc = cudbg_get_mem_region(padap, &meminfo, j,
+						  cudbg_region[i],
+						  &mem_desc);
+			if (!rc) {
+				found = 1;
+				rc = cudbg_get_mem_relative(padap, &meminfo, j,
+							    &mem_desc.base,
+							    &mem_desc.limit);
+				if (rc) {
+					ctx_info[i].exist = false;
+					break;
+				}
+				ctx_info[i].exist = true;
+				ctx_info[i].start = mem_desc.base;
+				ctx_info[i].end = mem_desc.limit;
+				mem_type[i] = j;
+				break;
+			}
+		}
+		if (!found)
+			ctx_info[i].exist = false;
+	}
+
+	/* Get FLM and CNM max qid. */
 	value = t4_read_reg(padap, SGE_FLM_CFG_A);
 
 	/* Get number of data freelist queues */
 	flq = HDRSTARTFLQ_G(value);
-	size = CUDBG_MAX_FL_QIDS >> flq;
+	ctx_info[CTXT_FLM].exist = true;
+	ctx_info[CTXT_FLM].end = (CUDBG_MAX_FL_QIDS >> flq) * SGE_CTXT_SIZE;
 
-	/* Add extra space for congestion manager contexts.
-	 * The number of CONM contexts are same as number of freelist
+	/* The number of CONM contexts are same as number of freelist
 	 * queues.
 	 */
-	size += size;
+	ctx_info[CTXT_CNM].exist = true;
+	ctx_info[CTXT_CNM].end = ctx_info[CTXT_FLM].end;
+
+	return 0;
+}
+
+int cudbg_dump_context_size(struct adapter *padap)
+{
+	struct cudbg_region_info region_info[CTXT_CNM + 1] = { {0} };
+	u8 mem_type[CTXT_INGRESS + 1] = { 0 };
+	u32 i, size = 0;
+	int rc;
+
+	/* Get max valid qid for each type of queue */
+	rc = cudbg_get_ctxt_region_info(padap, region_info, mem_type);
+	if (rc)
+		return rc;
+
+	for (i = 0; i < CTXT_CNM; i++) {
+		if (!region_info[i].exist) {
+			if (i == CTXT_EGRESS || i == CTXT_INGRESS)
+				size += CUDBG_LOWMEM_MAX_CTXT_QIDS *
+					SGE_CTXT_SIZE;
+			continue;
+		}
+
+		size += (region_info[i].end - region_info[i].start + 1) /
+			SGE_CTXT_SIZE;
+	}
 	return size * sizeof(struct cudbg_ch_cntxt);
 }
 
@@ -1153,16 +1745,54 @@ static void cudbg_read_sge_ctxt(struct cudbg_init *pdbg_init, u32 cid,
 		t4_sge_ctxt_rd_bd(padap, cid, ctype, data);
 }
 
+static void cudbg_get_sge_ctxt_fw(struct cudbg_init *pdbg_init, u32 max_qid,
+				  u8 ctxt_type,
+				  struct cudbg_ch_cntxt **out_buff)
+{
+	struct cudbg_ch_cntxt *buff = *out_buff;
+	int rc;
+	u32 j;
+
+	for (j = 0; j < max_qid; j++) {
+		cudbg_read_sge_ctxt(pdbg_init, j, ctxt_type, buff->data);
+		rc = cudbg_sge_ctxt_check_valid(buff->data, ctxt_type);
+		if (!rc)
+			continue;
+
+		buff->cntxt_type = ctxt_type;
+		buff->cntxt_id = j;
+		buff++;
+		if (ctxt_type == CTXT_FLM) {
+			cudbg_read_sge_ctxt(pdbg_init, j, CTXT_CNM, buff->data);
+			buff->cntxt_type = CTXT_CNM;
+			buff->cntxt_id = j;
+			buff++;
+		}
+	}
+
+	*out_buff = buff;
+}
+
 int cudbg_collect_dump_context(struct cudbg_init *pdbg_init,
 			       struct cudbg_buffer *dbg_buff,
 			       struct cudbg_error *cudbg_err)
 {
+	struct cudbg_region_info region_info[CTXT_CNM + 1] = { {0} };
 	struct adapter *padap = pdbg_init->adap;
+	u32 j, size, max_ctx_size, max_ctx_qid;
+	u8 mem_type[CTXT_INGRESS + 1] = { 0 };
 	struct cudbg_buffer temp_buff = { 0 };
 	struct cudbg_ch_cntxt *buff;
-	u32 size, i = 0;
+	u64 *dst_off, *src_off;
+	u8 *ctx_buf;
+	u8 i, k;
 	int rc;
 
+	/* Get max valid qid for each type of queue */
+	rc = cudbg_get_ctxt_region_info(padap, region_info, mem_type);
+	if (rc)
+		return rc;
+
 	rc = cudbg_dump_context_size(padap);
 	if (rc <= 0)
 		return CUDBG_STATUS_ENTITY_NOT_FOUND;
@@ -1172,23 +1802,79 @@ int cudbg_collect_dump_context(struct cudbg_init *pdbg_init,
 	if (rc)
 		return rc;
 
+	/* Get buffer with enough space to read the biggest context
+	 * region in memory.
+	 */
+	max_ctx_size = max(region_info[CTXT_EGRESS].end -
+			   region_info[CTXT_EGRESS].start + 1,
+			   region_info[CTXT_INGRESS].end -
+			   region_info[CTXT_INGRESS].start + 1);
+
+	ctx_buf = kvzalloc(max_ctx_size, GFP_KERNEL);
+	if (!ctx_buf) {
+		cudbg_put_buff(&temp_buff, dbg_buff);
+		return -ENOMEM;
+	}
+
 	buff = (struct cudbg_ch_cntxt *)temp_buff.data;
-	while (size > 0) {
-		buff->cntxt_type = CTXT_FLM;
-		buff->cntxt_id = i;
-		cudbg_read_sge_ctxt(pdbg_init, i, CTXT_FLM, buff->data);
-		buff++;
-		size -= sizeof(struct cudbg_ch_cntxt);
 
-		buff->cntxt_type = CTXT_CNM;
-		buff->cntxt_id = i;
-		cudbg_read_sge_ctxt(pdbg_init, i, CTXT_CNM, buff->data);
-		buff++;
-		size -= sizeof(struct cudbg_ch_cntxt);
+	/* Collect EGRESS and INGRESS context data.
+	 * In case of failures, fallback to collecting via FW or
+	 * backdoor access.
+	 */
+	for (i = CTXT_EGRESS; i <= CTXT_INGRESS; i++) {
+		if (!region_info[i].exist) {
+			max_ctx_qid = CUDBG_LOWMEM_MAX_CTXT_QIDS;
+			cudbg_get_sge_ctxt_fw(pdbg_init, max_ctx_qid, i,
+					      &buff);
+			continue;
+		}
 
-		i++;
+		max_ctx_size = region_info[i].end - region_info[i].start + 1;
+		max_ctx_qid = max_ctx_size / SGE_CTXT_SIZE;
+
+		t4_sge_ctxt_flush(padap, padap->mbox, i);
+		rc = t4_memory_rw(padap, MEMWIN_NIC, mem_type[i],
+				  region_info[i].start, max_ctx_size,
+				  (__be32 *)ctx_buf, 1);
+		if (rc) {
+			max_ctx_qid = CUDBG_LOWMEM_MAX_CTXT_QIDS;
+			cudbg_get_sge_ctxt_fw(pdbg_init, max_ctx_qid, i,
+					      &buff);
+			continue;
+		}
+
+		for (j = 0; j < max_ctx_qid; j++) {
+			src_off = (u64 *)(ctx_buf + j * SGE_CTXT_SIZE);
+			dst_off = (u64 *)buff->data;
+
+			/* The data is stored in 64-bit cpu order.  Convert it
+			 * to big endian before parsing.
+			 */
+			for (k = 0; k < SGE_CTXT_SIZE / sizeof(u64); k++)
+				dst_off[k] = cpu_to_be64(src_off[k]);
+
+			rc = cudbg_sge_ctxt_check_valid(buff->data, i);
+			if (!rc)
+				continue;
+
+			buff->cntxt_type = i;
+			buff->cntxt_id = j;
+			buff++;
+		}
 	}
 
+	kvfree(ctx_buf);
+
+	/* Collect FREELIST and CONGESTION MANAGER contexts */
+	max_ctx_size = region_info[CTXT_FLM].end -
+		       region_info[CTXT_FLM].start + 1;
+	max_ctx_qid = max_ctx_size / SGE_CTXT_SIZE;
+	/* Since FLM and CONM are 1-to-1 mapped, the below function
+	 * will fetch both FLM and CONM contexts.
+	 */
+	cudbg_get_sge_ctxt_fw(pdbg_init, max_ctx_qid, CTXT_FLM, &buff);
+
 	cudbg_write_and_release_buff(&temp_buff, dbg_buff);
 	return rc;
 }
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.h b/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.h
index caeee8e33e86..eebefe7cd18e 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.h
@@ -75,6 +75,12 @@ int cudbg_collect_edc0_meminfo(struct cudbg_init *pdbg_init,
 int cudbg_collect_edc1_meminfo(struct cudbg_init *pdbg_init,
 			       struct cudbg_buffer *dbg_buff,
 			       struct cudbg_error *cudbg_err);
+int cudbg_collect_mc0_meminfo(struct cudbg_init *pdbg_init,
+			      struct cudbg_buffer *dbg_buff,
+			      struct cudbg_error *cudbg_err);
+int cudbg_collect_mc1_meminfo(struct cudbg_init *pdbg_init,
+			      struct cudbg_buffer *dbg_buff,
+			      struct cudbg_error *cudbg_err);
 int cudbg_collect_rss(struct cudbg_init *pdbg_init,
 		      struct cudbg_buffer *dbg_buff,
 		      struct cudbg_error *cudbg_err);
@@ -102,6 +108,9 @@ int cudbg_collect_ulprx_la(struct cudbg_init *pdbg_init,
 int cudbg_collect_tp_la(struct cudbg_init *pdbg_init,
 			struct cudbg_buffer *dbg_buff,
 			struct cudbg_error *cudbg_err);
+int cudbg_collect_meminfo(struct cudbg_init *pdbg_init,
+			  struct cudbg_buffer *dbg_buff,
+			  struct cudbg_error *cudbg_err);
 int cudbg_collect_cim_pif_la(struct cudbg_init *pdbg_init,
 			     struct cudbg_buffer *dbg_buff,
 			     struct cudbg_error *cudbg_err);
@@ -123,6 +132,9 @@ int cudbg_collect_pm_indirect(struct cudbg_init *pdbg_init,
 int cudbg_collect_tid(struct cudbg_init *pdbg_init,
 		      struct cudbg_buffer *dbg_buff,
 		      struct cudbg_error *cudbg_err);
+int cudbg_collect_pcie_config(struct cudbg_init *pdbg_init,
+			      struct cudbg_buffer *dbg_buff,
+			      struct cudbg_error *cudbg_err);
 int cudbg_collect_dump_context(struct cudbg_init *pdbg_init,
 			       struct cudbg_buffer *dbg_buff,
 			       struct cudbg_error *cudbg_err);
@@ -156,6 +168,9 @@ int cudbg_collect_mbox_log(struct cudbg_init *pdbg_init,
 int cudbg_collect_hma_indirect(struct cudbg_init *pdbg_init,
 			       struct cudbg_buffer *dbg_buff,
 			       struct cudbg_error *cudbg_err);
+int cudbg_collect_hma_meminfo(struct cudbg_init *pdbg_init,
+			      struct cudbg_buffer *dbg_buff,
+			      struct cudbg_error *cudbg_err);
 
 struct cudbg_entity_hdr *cudbg_get_entity_hdr(void *outbuf, int i);
 void cudbg_align_debug_buffer(struct cudbg_buffer *dbg_buff,
@@ -163,7 +178,8 @@ void cudbg_align_debug_buffer(struct cudbg_buffer *dbg_buff,
 u32 cudbg_cim_obq_size(struct adapter *padap, int qid);
 int cudbg_dump_context_size(struct adapter *padap);
 
-struct cudbg_tcam;
+int cudbg_fill_meminfo(struct adapter *padap,
+		       struct cudbg_meminfo *meminfo_buff);
 void cudbg_fill_le_tcam_info(struct adapter *padap,
 			     struct cudbg_tcam *tcam_region);
 #endif /* __CUDBG_LIB_H__ */
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
index 6f9fa6e3c42a..97dc3efeb234 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
@@ -77,7 +77,8 @@ enum {
 	MEM_EDC1,
 	MEM_MC,
 	MEM_MC0 = MEM_MC,
-	MEM_MC1
+	MEM_MC1,
+	MEM_HMA,
 };
 
 enum {
@@ -1653,7 +1654,7 @@ int t4_ctrl_eq_free(struct adapter *adap, unsigned int mbox, unsigned int pf,
 		    unsigned int vf, unsigned int eqid);
 int t4_ofld_eq_free(struct adapter *adap, unsigned int mbox, unsigned int pf,
 		    unsigned int vf, unsigned int eqid);
-int t4_sge_ctxt_flush(struct adapter *adap, unsigned int mbox);
+int t4_sge_ctxt_flush(struct adapter *adap, unsigned int mbox, int ctxt_type);
 void t4_handle_get_port_info(struct port_info *pi, const __be64 *rpl);
 int t4_update_port_info(struct port_info *pi);
 int t4_get_link_params(struct port_info *pi, unsigned int *link_okp,
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.c
index 29cc625e9833..41c8736314f8 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.c
@@ -18,11 +18,13 @@
 #include "t4_regs.h"
 #include "cxgb4.h"
 #include "cxgb4_cudbg.h"
-#include "cudbg_entity.h"
 
 static const struct cxgb4_collect_entity cxgb4_collect_mem_dump[] = {
 	{ CUDBG_EDC0, cudbg_collect_edc0_meminfo },
 	{ CUDBG_EDC1, cudbg_collect_edc1_meminfo },
+	{ CUDBG_MC0, cudbg_collect_mc0_meminfo },
+	{ CUDBG_MC1, cudbg_collect_mc1_meminfo },
+	{ CUDBG_HMA, cudbg_collect_hma_meminfo },
 };
 
 static const struct cxgb4_collect_entity cxgb4_collect_hw_dump[] = {
@@ -53,6 +55,7 @@ static const struct cxgb4_collect_entity cxgb4_collect_hw_dump[] = {
 	{ CUDBG_SGE_INDIRECT, cudbg_collect_sge_indirect },
 	{ CUDBG_ULPRX_LA, cudbg_collect_ulprx_la },
 	{ CUDBG_TP_LA, cudbg_collect_tp_la },
+	{ CUDBG_MEMINFO, cudbg_collect_meminfo },
 	{ CUDBG_CIM_PIF_LA, cudbg_collect_cim_pif_la },
 	{ CUDBG_CLK, cudbg_collect_clk_info },
 	{ CUDBG_CIM_OBQ_RXQ0, cudbg_collect_obq_sge_rx_q0 },
@@ -60,6 +63,7 @@ static const struct cxgb4_collect_entity cxgb4_collect_hw_dump[] = {
 	{ CUDBG_PCIE_INDIRECT, cudbg_collect_pcie_indirect },
 	{ CUDBG_PM_INDIRECT, cudbg_collect_pm_indirect },
 	{ CUDBG_TID_INFO, cudbg_collect_tid },
+	{ CUDBG_PCIE_CONFIG, cudbg_collect_pcie_config },
 	{ CUDBG_DUMP_CONTEXT, cudbg_collect_dump_context },
 	{ CUDBG_MPS_TCAM, cudbg_collect_mps_tcam },
 	{ CUDBG_VPD_DATA, cudbg_collect_vpd_data },
@@ -158,6 +162,22 @@ static u32 cxgb4_get_entity_length(struct adapter *adap, u32 entity)
 		}
 		len = cudbg_mbytes_to_bytes(len);
 		break;
+	case CUDBG_MC0:
+		value = t4_read_reg(adap, MA_TARGET_MEM_ENABLE_A);
+		if (value & EXT_MEM0_ENABLE_F) {
+			value = t4_read_reg(adap, MA_EXT_MEMORY0_BAR_A);
+			len = EXT_MEM0_SIZE_G(value);
+		}
+		len = cudbg_mbytes_to_bytes(len);
+		break;
+	case CUDBG_MC1:
+		value = t4_read_reg(adap, MA_TARGET_MEM_ENABLE_A);
+		if (value & EXT_MEM1_ENABLE_F) {
+			value = t4_read_reg(adap, MA_EXT_MEMORY1_BAR_A);
+			len = EXT_MEM1_SIZE_G(value);
+		}
+		len = cudbg_mbytes_to_bytes(len);
+		break;
 	case CUDBG_RSS:
 		len = RSS_NENTRIES * sizeof(u16);
 		break;
@@ -201,6 +221,9 @@ static u32 cxgb4_get_entity_length(struct adapter *adap, u32 entity)
 	case CUDBG_TP_LA:
 		len = sizeof(struct cudbg_tp_la) + TPLA_SIZE * sizeof(u64);
 		break;
+	case CUDBG_MEMINFO:
+		len = sizeof(struct cudbg_meminfo);
+		break;
 	case CUDBG_CIM_PIF_LA:
 		len = sizeof(struct cudbg_cim_pif_la);
 		len += 2 * CIM_PIFLA_SIZE * 6 * sizeof(u32);
@@ -219,6 +242,9 @@ static u32 cxgb4_get_entity_length(struct adapter *adap, u32 entity)
 	case CUDBG_TID_INFO:
 		len = sizeof(struct cudbg_tid_info_region_rev1);
 		break;
+	case CUDBG_PCIE_CONFIG:
+		len = sizeof(u32) * CUDBG_NUM_PCIE_CONFIG_REGS;
+		break;
 	case CUDBG_DUMP_CONTEXT:
 		len = cudbg_dump_context_size(adap);
 		break;
@@ -264,6 +290,17 @@ static u32 cxgb4_get_entity_length(struct adapter *adap, u32 entity)
 			len = sizeof(struct ireg_buf) * n;
 		}
 		break;
+	case CUDBG_HMA:
+		value = t4_read_reg(adap, MA_TARGET_MEM_ENABLE_A);
+		if (value & HMA_MUX_F) {
+			/* In T6, there's no MC1.  So, HMA shares MC1
+			 * address space.
+			 */
+			value = t4_read_reg(adap, MA_EXT_MEMORY1_BAR_A);
+			len = EXT_MEM1_SIZE_G(value);
+		}
+		len = cudbg_mbytes_to_bytes(len);
+		break;
 	default:
 		break;
 	}
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.h
index c099b5aa2214..7ceeb0bc9fa8 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.h
@@ -20,6 +20,7 @@
 
 #include "cudbg_if.h"
 #include "cudbg_lib_common.h"
+#include "cudbg_entity.h"
 #include "cudbg_lib.h"
 
 typedef int (*cudbg_collect_callback_t)(struct cudbg_init *pdbg_init,
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c
index 917663b35603..4956e429ae1d 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c
@@ -45,6 +45,10 @@
 #include "cxgb4_debugfs.h"
 #include "clip_tbl.h"
 #include "l2t.h"
+#include "cudbg_if.h"
+#include "cudbg_lib_common.h"
+#include "cudbg_entity.h"
+#include "cudbg_lib.h"
 
 /* generic seq_file support for showing a table of size rows x width. */
 static void *seq_tab_get_idx(struct seq_tab *tb, loff_t pos)
@@ -2794,18 +2798,6 @@ static const struct file_operations blocked_fl_fops = {
 	.llseek  = generic_file_llseek,
 };
 
-struct mem_desc {
-	unsigned int base;
-	unsigned int limit;
-	unsigned int idx;
-};
-
-static int mem_desc_cmp(const void *a, const void *b)
-{
-	return ((const struct mem_desc *)a)->base -
-	       ((const struct mem_desc *)b)->base;
-}
-
 static void mem_region_show(struct seq_file *seq, const char *name,
 			    unsigned int from, unsigned int to)
 {
@@ -2819,250 +2811,60 @@ static void mem_region_show(struct seq_file *seq, const char *name,
 static int meminfo_show(struct seq_file *seq, void *v)
 {
 	static const char * const memory[] = { "EDC0:", "EDC1:", "MC:",
-					"MC0:", "MC1:"};
-	static const char * const region[] = {
-		"DBQ contexts:", "IMSG contexts:", "FLM cache:", "TCBs:",
-		"Pstructs:", "Timers:", "Rx FL:", "Tx FL:", "Pstruct FL:",
-		"Tx payload:", "Rx payload:", "LE hash:", "iSCSI region:",
-		"TDDP region:", "TPT region:", "STAG region:", "RQ region:",
-		"RQUDP region:", "PBL region:", "TXPBL region:",
-		"DBVFIFO region:", "ULPRX state:", "ULPTX state:",
-		"On-chip queues:"
-	};
-
-	int i, n;
-	u32 lo, hi, used, alloc;
-	struct mem_desc avail[4];
-	struct mem_desc mem[ARRAY_SIZE(region) + 3];      /* up to 3 holes */
-	struct mem_desc *md = mem;
+					       "MC0:", "MC1:", "HMA:"};
 	struct adapter *adap = seq->private;
+	struct cudbg_meminfo meminfo;
+	int i, rc;
 
-	for (i = 0; i < ARRAY_SIZE(mem); i++) {
-		mem[i].limit = 0;
-		mem[i].idx = i;
-	}
-
-	/* Find and sort the populated memory ranges */
-	i = 0;
-	lo = t4_read_reg(adap, MA_TARGET_MEM_ENABLE_A);
-	if (lo & EDRAM0_ENABLE_F) {
-		hi = t4_read_reg(adap, MA_EDRAM0_BAR_A);
-		avail[i].base = EDRAM0_BASE_G(hi) << 20;
-		avail[i].limit = avail[i].base + (EDRAM0_SIZE_G(hi) << 20);
-		avail[i].idx = 0;
-		i++;
-	}
-	if (lo & EDRAM1_ENABLE_F) {
-		hi = t4_read_reg(adap, MA_EDRAM1_BAR_A);
-		avail[i].base = EDRAM1_BASE_G(hi) << 20;
-		avail[i].limit = avail[i].base + (EDRAM1_SIZE_G(hi) << 20);
-		avail[i].idx = 1;
-		i++;
-	}
-
-	if (is_t5(adap->params.chip)) {
-		if (lo & EXT_MEM0_ENABLE_F) {
-			hi = t4_read_reg(adap, MA_EXT_MEMORY0_BAR_A);
-			avail[i].base = EXT_MEM0_BASE_G(hi) << 20;
-			avail[i].limit =
-				avail[i].base + (EXT_MEM0_SIZE_G(hi) << 20);
-			avail[i].idx = 3;
-			i++;
-		}
-		if (lo & EXT_MEM1_ENABLE_F) {
-			hi = t4_read_reg(adap, MA_EXT_MEMORY1_BAR_A);
-			avail[i].base = EXT_MEM1_BASE_G(hi) << 20;
-			avail[i].limit =
-				avail[i].base + (EXT_MEM1_SIZE_G(hi) << 20);
-			avail[i].idx = 4;
-			i++;
-		}
-	} else {
-		if (lo & EXT_MEM_ENABLE_F) {
-			hi = t4_read_reg(adap, MA_EXT_MEMORY_BAR_A);
-			avail[i].base = EXT_MEM_BASE_G(hi) << 20;
-			avail[i].limit =
-				avail[i].base + (EXT_MEM_SIZE_G(hi) << 20);
-			avail[i].idx = 2;
-			i++;
-		}
-	}
-	if (!i)                                    /* no memory available */
-		return 0;
-	sort(avail, i, sizeof(struct mem_desc), mem_desc_cmp, NULL);
-
-	(md++)->base = t4_read_reg(adap, SGE_DBQ_CTXT_BADDR_A);
-	(md++)->base = t4_read_reg(adap, SGE_IMSG_CTXT_BADDR_A);
-	(md++)->base = t4_read_reg(adap, SGE_FLM_CACHE_BADDR_A);
-	(md++)->base = t4_read_reg(adap, TP_CMM_TCB_BASE_A);
-	(md++)->base = t4_read_reg(adap, TP_CMM_MM_BASE_A);
-	(md++)->base = t4_read_reg(adap, TP_CMM_TIMER_BASE_A);
-	(md++)->base = t4_read_reg(adap, TP_CMM_MM_RX_FLST_BASE_A);
-	(md++)->base = t4_read_reg(adap, TP_CMM_MM_TX_FLST_BASE_A);
-	(md++)->base = t4_read_reg(adap, TP_CMM_MM_PS_FLST_BASE_A);
-
-	/* the next few have explicit upper bounds */
-	md->base = t4_read_reg(adap, TP_PMM_TX_BASE_A);
-	md->limit = md->base - 1 +
-		    t4_read_reg(adap, TP_PMM_TX_PAGE_SIZE_A) *
-		    PMTXMAXPAGE_G(t4_read_reg(adap, TP_PMM_TX_MAX_PAGE_A));
-	md++;
-
-	md->base = t4_read_reg(adap, TP_PMM_RX_BASE_A);
-	md->limit = md->base - 1 +
-		    t4_read_reg(adap, TP_PMM_RX_PAGE_SIZE_A) *
-		    PMRXMAXPAGE_G(t4_read_reg(adap, TP_PMM_RX_MAX_PAGE_A));
-	md++;
-
-	if (t4_read_reg(adap, LE_DB_CONFIG_A) & HASHEN_F) {
-		if (CHELSIO_CHIP_VERSION(adap->params.chip) <= CHELSIO_T5) {
-			hi = t4_read_reg(adap, LE_DB_TID_HASHBASE_A) / 4;
-			md->base = t4_read_reg(adap, LE_DB_HASH_TID_BASE_A);
-		 } else {
-			hi = t4_read_reg(adap, LE_DB_HASH_TID_BASE_A);
-			md->base = t4_read_reg(adap,
-					       LE_DB_HASH_TBL_BASE_ADDR_A);
-		}
-		md->limit = 0;
-	} else {
-		md->base = 0;
-		md->idx = ARRAY_SIZE(region);  /* hide it */
-	}
-	md++;
-
-#define ulp_region(reg) do { \
-	md->base = t4_read_reg(adap, ULP_ ## reg ## _LLIMIT_A);\
-	(md++)->limit = t4_read_reg(adap, ULP_ ## reg ## _ULIMIT_A); \
-} while (0)
-
-	ulp_region(RX_ISCSI);
-	ulp_region(RX_TDDP);
-	ulp_region(TX_TPT);
-	ulp_region(RX_STAG);
-	ulp_region(RX_RQ);
-	ulp_region(RX_RQUDP);
-	ulp_region(RX_PBL);
-	ulp_region(TX_PBL);
-#undef ulp_region
-	md->base = 0;
-	md->idx = ARRAY_SIZE(region);
-	if (!is_t4(adap->params.chip)) {
-		u32 size = 0;
-		u32 sge_ctrl = t4_read_reg(adap, SGE_CONTROL2_A);
-		u32 fifo_size = t4_read_reg(adap, SGE_DBVFIFO_SIZE_A);
-
-		if (is_t5(adap->params.chip)) {
-			if (sge_ctrl & VFIFO_ENABLE_F)
-				size = DBVFIFO_SIZE_G(fifo_size);
-		} else {
-			size = T6_DBVFIFO_SIZE_G(fifo_size);
-		}
-
-		if (size) {
-			md->base = BASEADDR_G(t4_read_reg(adap,
-					SGE_DBVFIFO_BADDR_A));
-			md->limit = md->base + (size << 2) - 1;
-		}
-	}
-
-	md++;
-
-	md->base = t4_read_reg(adap, ULP_RX_CTX_BASE_A);
-	md->limit = 0;
-	md++;
-	md->base = t4_read_reg(adap, ULP_TX_ERR_TABLE_BASE_A);
-	md->limit = 0;
-	md++;
-
-	md->base = adap->vres.ocq.start;
-	if (adap->vres.ocq.size)
-		md->limit = md->base + adap->vres.ocq.size - 1;
-	else
-		md->idx = ARRAY_SIZE(region);  /* hide it */
-	md++;
-
-	/* add any address-space holes, there can be up to 3 */
-	for (n = 0; n < i - 1; n++)
-		if (avail[n].limit < avail[n + 1].base)
-			(md++)->base = avail[n].limit;
-	if (avail[n].limit)
-		(md++)->base = avail[n].limit;
-
-	n = md - mem;
-	sort(mem, n, sizeof(struct mem_desc), mem_desc_cmp, NULL);
+	memset(&meminfo, 0, sizeof(struct cudbg_meminfo));
+	rc = cudbg_fill_meminfo(adap, &meminfo);
+	if (rc)
+		return -ENXIO;
 
-	for (lo = 0; lo < i; lo++)
-		mem_region_show(seq, memory[avail[lo].idx], avail[lo].base,
-				avail[lo].limit - 1);
+	for (i = 0; i < meminfo.avail_c; i++)
+		mem_region_show(seq, memory[meminfo.avail[i].idx],
+				meminfo.avail[i].base,
+				meminfo.avail[i].limit - 1);
 
 	seq_putc(seq, '\n');
-	for (i = 0; i < n; i++) {
-		if (mem[i].idx >= ARRAY_SIZE(region))
+	for (i = 0; i < meminfo.mem_c; i++) {
+		if (meminfo.mem[i].idx >= ARRAY_SIZE(cudbg_region))
 			continue;                        /* skip holes */
-		if (!mem[i].limit)
-			mem[i].limit = i < n - 1 ? mem[i + 1].base - 1 : ~0;
-		mem_region_show(seq, region[mem[i].idx], mem[i].base,
-				mem[i].limit);
+		if (!meminfo.mem[i].limit)
+			meminfo.mem[i].limit =
+				i < meminfo.mem_c - 1 ?
+				meminfo.mem[i + 1].base - 1 : ~0;
+		mem_region_show(seq, cudbg_region[meminfo.mem[i].idx],
+				meminfo.mem[i].base, meminfo.mem[i].limit);
 	}
 
 	seq_putc(seq, '\n');
-	lo = t4_read_reg(adap, CIM_SDRAM_BASE_ADDR_A);
-	hi = t4_read_reg(adap, CIM_SDRAM_ADDR_SIZE_A) + lo - 1;
-	mem_region_show(seq, "uP RAM:", lo, hi);
+	mem_region_show(seq, "uP RAM:", meminfo.up_ram_lo, meminfo.up_ram_hi);
+	mem_region_show(seq, "uP Extmem2:", meminfo.up_extmem2_lo,
+			meminfo.up_extmem2_hi);
 
-	lo = t4_read_reg(adap, CIM_EXTMEM2_BASE_ADDR_A);
-	hi = t4_read_reg(adap, CIM_EXTMEM2_ADDR_SIZE_A) + lo - 1;
-	mem_region_show(seq, "uP Extmem2:", lo, hi);
-
-	lo = t4_read_reg(adap, TP_PMM_RX_MAX_PAGE_A);
 	seq_printf(seq, "\n%u Rx pages of size %uKiB for %u channels\n",
-		   PMRXMAXPAGE_G(lo),
-		   t4_read_reg(adap, TP_PMM_RX_PAGE_SIZE_A) >> 10,
-		   (lo & PMRXNUMCHN_F) ? 2 : 1);
+		   meminfo.rx_pages_data[0], meminfo.rx_pages_data[1],
+		   meminfo.rx_pages_data[2]);
 
-	lo = t4_read_reg(adap, TP_PMM_TX_MAX_PAGE_A);
-	hi = t4_read_reg(adap, TP_PMM_TX_PAGE_SIZE_A);
 	seq_printf(seq, "%u Tx pages of size %u%ciB for %u channels\n",
-		   PMTXMAXPAGE_G(lo),
-		   hi >= (1 << 20) ? (hi >> 20) : (hi >> 10),
-		   hi >= (1 << 20) ? 'M' : 'K', 1 << PMTXNUMCHN_G(lo));
-	seq_printf(seq, "%u p-structs\n\n",
-		   t4_read_reg(adap, TP_CMM_MM_MAX_PSTRUCT_A));
-
-	for (i = 0; i < 4; i++) {
-		if (CHELSIO_CHIP_VERSION(adap->params.chip) > CHELSIO_T5)
-			lo = t4_read_reg(adap, MPS_RX_MAC_BG_PG_CNT0_A + i * 4);
-		else
-			lo = t4_read_reg(adap, MPS_RX_PG_RSV0_A + i * 4);
-		if (is_t5(adap->params.chip)) {
-			used = T5_USED_G(lo);
-			alloc = T5_ALLOC_G(lo);
-		} else {
-			used = USED_G(lo);
-			alloc = ALLOC_G(lo);
-		}
+		   meminfo.tx_pages_data[0], meminfo.tx_pages_data[1],
+		   meminfo.tx_pages_data[2], meminfo.tx_pages_data[3]);
+
+	seq_printf(seq, "%u p-structs\n\n", meminfo.p_structs);
+
+	for (i = 0; i < 4; i++)
 		/* For T6 these are MAC buffer groups */
 		seq_printf(seq, "Port %d using %u pages out of %u allocated\n",
-			   i, used, alloc);
-	}
-	for (i = 0; i < adap->params.arch.nchan; i++) {
-		if (CHELSIO_CHIP_VERSION(adap->params.chip) > CHELSIO_T5)
-			lo = t4_read_reg(adap,
-					 MPS_RX_LPBK_BG_PG_CNT0_A + i * 4);
-		else
-			lo = t4_read_reg(adap, MPS_RX_PG_RSV4_A + i * 4);
-		if (is_t5(adap->params.chip)) {
-			used = T5_USED_G(lo);
-			alloc = T5_ALLOC_G(lo);
-		} else {
-			used = USED_G(lo);
-			alloc = ALLOC_G(lo);
-		}
+			   i, meminfo.port_used[i], meminfo.port_alloc[i]);
+
+	for (i = 0; i < adap->params.arch.nchan; i++)
 		/* For T6 these are MAC buffer groups */
 		seq_printf(seq,
 			   "Loopback %d using %u pages out of %u allocated\n",
-			   i, used, alloc);
-	}
+			   i, meminfo.loopback_used[i],
+			   meminfo.loopback_alloc[i]);
+
 	return 0;
 }
 
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index 6f900ffe25cc..87ac1e4dafc1 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -1673,7 +1673,7 @@ int cxgb4_flush_eq_cache(struct net_device *dev)
 {
 	struct adapter *adap = netdev2adap(dev);
 
-	return t4_sge_ctxt_flush(adap, adap->mbox);
+	return t4_sge_ctxt_flush(adap, adap->mbox, CTXT_EGRESS);
 }
 EXPORT_SYMBOL(cxgb4_flush_eq_cache);
 
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
index d4a548a6a55c..a12b894f135d 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
@@ -405,9 +405,7 @@ static void cxgb4_process_flow_actions(struct net_device *in,
 		} else if (is_tcf_gact_shot(a)) {
 			fs->action = FILTER_DROP;
 		} else if (is_tcf_mirred_egress_redirect(a)) {
-			int ifindex = tcf_mirred_ifindex(a);
-			struct net_device *out = __dev_get_by_index(dev_net(in),
-								    ifindex);
+			struct net_device *out = tcf_mirred_dev(a);
 			struct port_info *pi = netdev_priv(out);
 
 			fs->action = FILTER_SWITCH;
@@ -582,14 +580,14 @@ static int cxgb4_validate_flow_actions(struct net_device *dev,
 			/* Do nothing */
 		} else if (is_tcf_mirred_egress_redirect(a)) {
 			struct adapter *adap = netdev2adap(dev);
-			struct net_device *n_dev;
-			unsigned int i, ifindex;
+			struct net_device *n_dev, *target_dev;
+			unsigned int i;
 			bool found = false;
 
-			ifindex = tcf_mirred_ifindex(a);
+			target_dev = tcf_mirred_dev(a);
 			for_each_port(adap, i) {
 				n_dev = adap->port[i];
-				if (ifindex == n_dev->ifindex) {
+				if (target_dev == n_dev) {
 					found = true;
 					break;
 				}
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_u32.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_u32.c
index cd0cd13a964d..ab174bcfbfb0 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_u32.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_u32.c
@@ -114,14 +114,14 @@ static int fill_action_fields(struct adapter *adap,
 
 		/* Re-direct to specified port in hardware. */
 		if (is_tcf_mirred_egress_redirect(a)) {
-			struct net_device *n_dev;
-			unsigned int i, index;
+			struct net_device *n_dev, *target_dev;
 			bool found = false;
+			unsigned int i;
 
-			index = tcf_mirred_ifindex(a);
+			target_dev = tcf_mirred_dev(a);
 			for_each_port(adap, i) {
 				n_dev = adap->port[i];
-				if (index == n_dev->ifindex) {
+				if (target_dev == n_dev) {
 					fs->action = FILTER_SWITCH;
 					fs->eport = i;
 					found = true;
diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
index f63210f15579..112963defd0b 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
@@ -524,11 +524,14 @@ int t4_memory_rw(struct adapter *adap, int win, int mtype, u32 addr,
 	 * MEM_EDC1 = 1
 	 * MEM_MC   = 2 -- MEM_MC for chips with only 1 memory controller
 	 * MEM_MC1  = 3 -- for chips with 2 memory controllers (e.g. T5)
+	 * MEM_HMA  = 4
 	 */
 	edc_size  = EDRAM0_SIZE_G(t4_read_reg(adap, MA_EDRAM0_BAR_A));
-	if (mtype != MEM_MC1)
+	if (mtype == MEM_HMA) {
+		memoffset = 2 * (edc_size * 1024 * 1024);
+	} else if (mtype != MEM_MC1) {
 		memoffset = (mtype * (edc_size * 1024 * 1024));
-	else {
+	} else {
 		mc_size = EXT_MEM0_SIZE_G(t4_read_reg(adap,
 						      MA_EXT_MEMORY0_BAR_A));
 		memoffset = (MEM_MC0 * edc_size + mc_size) * 1024 * 1024;
@@ -6527,18 +6530,21 @@ void t4_sge_decode_idma_state(struct adapter *adapter, int state)
  *      t4_sge_ctxt_flush - flush the SGE context cache
  *      @adap: the adapter
  *      @mbox: mailbox to use for the FW command
+ *      @ctx_type: Egress or Ingress
  *
  *      Issues a FW command through the given mailbox to flush the
  *      SGE context cache.
  */
-int t4_sge_ctxt_flush(struct adapter *adap, unsigned int mbox)
+int t4_sge_ctxt_flush(struct adapter *adap, unsigned int mbox, int ctxt_type)
 {
 	int ret;
 	u32 ldst_addrspace;
 	struct fw_ldst_cmd c;
 
 	memset(&c, 0, sizeof(c));
-	ldst_addrspace = FW_LDST_CMD_ADDRSPACE_V(FW_LDST_ADDRSPC_SGE_EGRC);
+	ldst_addrspace = FW_LDST_CMD_ADDRSPACE_V(ctxt_type == CTXT_EGRESS ?
+						 FW_LDST_ADDRSPC_SGE_EGRC :
+						 FW_LDST_ADDRSPC_SGE_INGC);
 	c.op_to_addrspace = cpu_to_be32(FW_CMD_OP_V(FW_LDST_CMD) |
 					FW_CMD_REQUEST_F | FW_CMD_READ_F |
 					ldst_addrspace);
diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.h b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.h
index a964ed184356..83afb32c8491 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.h
@@ -70,7 +70,9 @@ enum {
 
 /* SGE context types */
 enum ctxt_type {
-	CTXT_FLM = 2,
+	CTXT_EGRESS,
+	CTXT_INGRESS,
+	CTXT_FLM,
 	CTXT_CNM,
 };
 
diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4_regs.h b/drivers/net/ethernet/chelsio/cxgb4/t4_regs.h
index a7cfece72828..f6701e0a6701 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/t4_regs.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/t4_regs.h
@@ -961,6 +961,10 @@
 
 #define MA_EXT_MEMORY1_BAR_A 0x7808
 
+#define HMA_MUX_S    5
+#define HMA_MUX_V(x) ((x) << HMA_MUX_S)
+#define HMA_MUX_F    HMA_MUX_V(1U)
+
 #define EXT_MEM1_BASE_S    16
 #define EXT_MEM1_BASE_M    0xfffU
 #define EXT_MEM1_BASE_G(x) (((x) >> EXT_MEM1_BASE_S) & EXT_MEM1_BASE_M)
diff --git a/drivers/net/ethernet/cisco/enic/enic_ethtool.c b/drivers/net/ethernet/cisco/enic/enic_ethtool.c
index 462d0ce51240..efb9333c7cf8 100644
--- a/drivers/net/ethernet/cisco/enic/enic_ethtool.c
+++ b/drivers/net/ethernet/cisco/enic/enic_ethtool.c
@@ -18,6 +18,7 @@
 
 #include <linux/netdevice.h>
 #include <linux/ethtool.h>
+#include <linux/net_tstamp.h>
 
 #include "enic_res.h"
 #include "enic.h"
@@ -578,6 +579,16 @@ static int enic_set_rxfh(struct net_device *netdev, const u32 *indir,
 	return __enic_set_rsskey(enic);
 }
 
+static int enic_get_ts_info(struct net_device *netdev,
+			    struct ethtool_ts_info *info)
+{
+	info->so_timestamping = SOF_TIMESTAMPING_TX_SOFTWARE |
+				SOF_TIMESTAMPING_RX_SOFTWARE |
+				SOF_TIMESTAMPING_SOFTWARE;
+
+	return 0;
+}
+
 static const struct ethtool_ops enic_ethtool_ops = {
 	.get_drvinfo = enic_get_drvinfo,
 	.get_msglevel = enic_get_msglevel,
@@ -597,6 +608,7 @@ static const struct ethtool_ops enic_ethtool_ops = {
 	.get_rxfh = enic_get_rxfh,
 	.set_rxfh = enic_set_rxfh,
 	.get_link_ksettings = enic_get_ksettings,
+	.get_ts_info = enic_get_ts_info,
 };
 
 void enic_set_ethtool_ops(struct net_device *netdev)
diff --git a/drivers/net/ethernet/cisco/enic/enic_main.c b/drivers/net/ethernet/cisco/enic/enic_main.c
index e130fb757e7b..d98676e43e03 100644
--- a/drivers/net/ethernet/cisco/enic/enic_main.c
+++ b/drivers/net/ethernet/cisco/enic/enic_main.c
@@ -856,6 +856,7 @@ static netdev_tx_t enic_hard_start_xmit(struct sk_buff *skb,
 
 	if (vnic_wq_desc_avail(wq) < MAX_SKB_FRAGS + ENIC_DESC_MAX_SPLITS)
 		netif_tx_stop_queue(txq);
+	skb_tx_timestamp(skb);
 	if (!skb->xmit_more || netif_xmit_stopped(txq))
 		vnic_wq_doorbell(wq);
 
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index 59ed806a52c3..d07c700c7ff8 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -17,7 +17,7 @@
 #include <linux/netdevice.h>
 #include <linux/pci.h>
 #include <linux/platform_device.h>
-
+#include <net/rtnetlink.h>
 #include "hclge_cmd.h"
 #include "hclge_dcb.h"
 #include "hclge_main.h"
@@ -2226,6 +2226,12 @@ static int hclge_mac_init(struct hclge_dev *hdev)
 	return hclge_cfg_func_mta_filter(hdev, 0, hdev->accept_mta_mc);
 }
 
+static void hclge_reset_task_schedule(struct hclge_dev *hdev)
+{
+	if (!test_and_set_bit(HCLGE_STATE_RST_SERVICE_SCHED, &hdev->state))
+		schedule_work(&hdev->rst_service_task);
+}
+
 static void hclge_task_schedule(struct hclge_dev *hdev)
 {
 	if (!test_bit(HCLGE_STATE_DOWN, &hdev->state) &&
@@ -2362,6 +2368,46 @@ static void hclge_service_complete(struct hclge_dev *hdev)
 	clear_bit(HCLGE_STATE_SERVICE_SCHED, &hdev->state);
 }
 
+static u32 hclge_check_event_cause(struct hclge_dev *hdev, u32 *clearval)
+{
+	u32 rst_src_reg;
+
+	/* fetch the events from their corresponding regs */
+	rst_src_reg = hclge_read_dev(&hdev->hw, HCLGE_MISC_RESET_STS_REG);
+
+	/* check for vector0 reset event sources */
+	if (BIT(HCLGE_VECTOR0_GLOBALRESET_INT_B) & rst_src_reg) {
+		set_bit(HNAE3_GLOBAL_RESET, &hdev->reset_pending);
+		*clearval = BIT(HCLGE_VECTOR0_GLOBALRESET_INT_B);
+		return HCLGE_VECTOR0_EVENT_RST;
+	}
+
+	if (BIT(HCLGE_VECTOR0_CORERESET_INT_B) & rst_src_reg) {
+		set_bit(HNAE3_CORE_RESET, &hdev->reset_pending);
+		*clearval = BIT(HCLGE_VECTOR0_CORERESET_INT_B);
+		return HCLGE_VECTOR0_EVENT_RST;
+	}
+
+	if (BIT(HCLGE_VECTOR0_IMPRESET_INT_B) & rst_src_reg) {
+		set_bit(HNAE3_IMP_RESET, &hdev->reset_pending);
+		*clearval = BIT(HCLGE_VECTOR0_IMPRESET_INT_B);
+		return HCLGE_VECTOR0_EVENT_RST;
+	}
+
+	/* mailbox event sharing vector 0 interrupt would be placed here */
+
+	return HCLGE_VECTOR0_EVENT_OTHER;
+}
+
+static void hclge_clear_event_cause(struct hclge_dev *hdev, u32 event_type,
+				    u32 regclr)
+{
+	if (event_type == HCLGE_VECTOR0_EVENT_RST)
+		hclge_write_dev(&hdev->hw, HCLGE_MISC_RESET_STS_REG, regclr);
+
+	/* mailbox event sharing vector 0 interrupt would be placed here */
+}
+
 static void hclge_enable_vector(struct hclge_misc_vector *vector, bool enable)
 {
 	writel(enable ? 1 : 0, vector->addr);
@@ -2370,10 +2416,28 @@ static void hclge_enable_vector(struct hclge_misc_vector *vector, bool enable)
 static irqreturn_t hclge_misc_irq_handle(int irq, void *data)
 {
 	struct hclge_dev *hdev = data;
+	u32 event_cause;
+	u32 clearval;
 
 	hclge_enable_vector(&hdev->misc_vector, false);
-	if (!test_and_set_bit(HCLGE_STATE_SERVICE_SCHED, &hdev->state))
-		schedule_work(&hdev->service_task);
+	event_cause = hclge_check_event_cause(hdev, &clearval);
+
+	/* vector 0 interrupt is shared with reset and mailbox source events.
+	 * For now, we are not handling mailbox events.
+	 */
+	switch (event_cause) {
+	case HCLGE_VECTOR0_EVENT_RST:
+		hclge_reset_task_schedule(hdev);
+		break;
+	default:
+		dev_dbg(&hdev->pdev->dev,
+			"received unknown or unhandled event of vector0\n");
+		break;
+	}
+
+	/* we should clear the source of interrupt */
+	hclge_clear_event_cause(hdev, event_cause, clearval);
+	hclge_enable_vector(&hdev->misc_vector, true);
 
 	return IRQ_HANDLED;
 }
@@ -2404,9 +2468,9 @@ static int hclge_misc_irq_init(struct hclge_dev *hdev)
 
 	hclge_get_misc_vector(hdev);
 
-	ret = devm_request_irq(&hdev->pdev->dev,
-			       hdev->misc_vector.vector_irq,
-			       hclge_misc_irq_handle, 0, "hclge_misc", hdev);
+	/* this would be explicitly freed in the end */
+	ret = request_irq(hdev->misc_vector.vector_irq, hclge_misc_irq_handle,
+			  0, "hclge_misc", hdev);
 	if (ret) {
 		hclge_free_vector(hdev, 0);
 		dev_err(&hdev->pdev->dev, "request misc irq(%d) fail\n",
@@ -2416,6 +2480,12 @@ static int hclge_misc_irq_init(struct hclge_dev *hdev)
 	return ret;
 }
 
+static void hclge_misc_irq_uninit(struct hclge_dev *hdev)
+{
+	free_irq(hdev->misc_vector.vector_irq, hdev);
+	hclge_free_vector(hdev, 0);
+}
+
 static int hclge_notify_client(struct hclge_dev *hdev,
 			       enum hnae3_reset_notify_type type)
 {
@@ -2471,12 +2541,6 @@ static int hclge_reset_wait(struct hclge_dev *hdev)
 		cnt++;
 	}
 
-	/* must clear reset status register to
-	 * prevent driver detect reset interrupt again
-	 */
-	reg = hclge_read_dev(&hdev->hw, HCLGE_MISC_RESET_STS_REG);
-	hclge_write_dev(&hdev->hw, HCLGE_MISC_RESET_STS_REG, reg);
-
 	if (cnt >= HCLGE_RESET_WAIT_CNT) {
 		dev_warn(&hdev->pdev->dev,
 			 "Wait for reset timeout: %d\n", hdev->reset_type);
@@ -2505,12 +2569,12 @@ static int hclge_func_reset_cmd(struct hclge_dev *hdev, int func_id)
 	return ret;
 }
 
-static void hclge_do_reset(struct hclge_dev *hdev, enum hnae3_reset_type type)
+static void hclge_do_reset(struct hclge_dev *hdev)
 {
 	struct pci_dev *pdev = hdev->pdev;
 	u32 val;
 
-	switch (type) {
+	switch (hdev->reset_type) {
 	case HNAE3_GLOBAL_RESET:
 		val = hclge_read_dev(&hdev->hw, HCLGE_GLOBAL_RESET_REG);
 		hnae_set_bit(val, HCLGE_GLOBAL_RESET_BIT, 1);
@@ -2526,30 +2590,62 @@ static void hclge_do_reset(struct hclge_dev *hdev, enum hnae3_reset_type type)
 	case HNAE3_FUNC_RESET:
 		dev_info(&pdev->dev, "PF Reset requested\n");
 		hclge_func_reset_cmd(hdev, 0);
+		/* schedule again to check later */
+		set_bit(HNAE3_FUNC_RESET, &hdev->reset_pending);
+		hclge_reset_task_schedule(hdev);
 		break;
 	default:
 		dev_warn(&pdev->dev,
-			 "Unsupported reset type: %d\n", type);
+			 "Unsupported reset type: %d\n", hdev->reset_type);
 		break;
 	}
 }
 
-static enum hnae3_reset_type hclge_detected_reset_event(struct hclge_dev *hdev)
+static enum hnae3_reset_type hclge_get_reset_level(struct hclge_dev *hdev,
+						   unsigned long *addr)
 {
 	enum hnae3_reset_type rst_level = HNAE3_NONE_RESET;
-	u32 rst_reg_val;
 
-	rst_reg_val = hclge_read_dev(&hdev->hw, HCLGE_MISC_RESET_STS_REG);
-	if (BIT(HCLGE_VECTOR0_GLOBALRESET_INT_B) & rst_reg_val)
+	/* return the highest priority reset level amongst all */
+	if (test_bit(HNAE3_GLOBAL_RESET, addr))
 		rst_level = HNAE3_GLOBAL_RESET;
-	else if (BIT(HCLGE_VECTOR0_CORERESET_INT_B) & rst_reg_val)
+	else if (test_bit(HNAE3_CORE_RESET, addr))
 		rst_level = HNAE3_CORE_RESET;
-	else if (BIT(HCLGE_VECTOR0_IMPRESET_INT_B) & rst_reg_val)
+	else if (test_bit(HNAE3_IMP_RESET, addr))
 		rst_level = HNAE3_IMP_RESET;
+	else if (test_bit(HNAE3_FUNC_RESET, addr))
+		rst_level = HNAE3_FUNC_RESET;
+
+	/* now, clear all other resets */
+	clear_bit(HNAE3_GLOBAL_RESET, addr);
+	clear_bit(HNAE3_CORE_RESET, addr);
+	clear_bit(HNAE3_IMP_RESET, addr);
+	clear_bit(HNAE3_FUNC_RESET, addr);
 
 	return rst_level;
 }
 
+static void hclge_reset(struct hclge_dev *hdev)
+{
+	/* perform reset of the stack & ae device for a client */
+
+	hclge_notify_client(hdev, HNAE3_DOWN_CLIENT);
+
+	if (!hclge_reset_wait(hdev)) {
+		rtnl_lock();
+		hclge_notify_client(hdev, HNAE3_UNINIT_CLIENT);
+		hclge_reset_ae_dev(hdev->ae_dev);
+		hclge_notify_client(hdev, HNAE3_INIT_CLIENT);
+		rtnl_unlock();
+	} else {
+		/* schedule again to check pending resets later */
+		set_bit(hdev->reset_type, &hdev->reset_pending);
+		hclge_reset_task_schedule(hdev);
+	}
+
+	hclge_notify_client(hdev, HNAE3_UP_CLIENT);
+}
+
 static void hclge_reset_event(struct hnae3_handle *handle,
 			      enum hnae3_reset_type reset)
 {
@@ -2563,14 +2659,9 @@ static void hclge_reset_event(struct hnae3_handle *handle,
 	case HNAE3_FUNC_RESET:
 	case HNAE3_CORE_RESET:
 	case HNAE3_GLOBAL_RESET:
-		if (test_bit(HCLGE_STATE_RESET_INT, &hdev->state)) {
-			dev_err(&hdev->pdev->dev, "Already in reset state");
-			return;
-		}
-		hdev->reset_type = reset;
-		set_bit(HCLGE_STATE_RESET_INT, &hdev->state);
-		set_bit(HCLGE_STATE_SERVICE_SCHED, &hdev->state);
-		schedule_work(&hdev->service_task);
+		/* request reset & schedule reset task */
+		set_bit(reset, &hdev->reset_request);
+		hclge_reset_task_schedule(hdev);
 		break;
 	default:
 		dev_warn(&hdev->pdev->dev, "Unsupported reset event:%d", reset);
@@ -2580,49 +2671,40 @@ static void hclge_reset_event(struct hnae3_handle *handle,
 
 static void hclge_reset_subtask(struct hclge_dev *hdev)
 {
-	bool do_reset;
-
-	do_reset = hdev->reset_type != HNAE3_NONE_RESET;
-
-	/* Reset is detected by interrupt */
-	if (hdev->reset_type == HNAE3_NONE_RESET)
-		hdev->reset_type = hclge_detected_reset_event(hdev);
-
-	if (hdev->reset_type == HNAE3_NONE_RESET)
-		return;
-
-	switch (hdev->reset_type) {
-	case HNAE3_FUNC_RESET:
-	case HNAE3_CORE_RESET:
-	case HNAE3_GLOBAL_RESET:
-	case HNAE3_IMP_RESET:
-		hclge_notify_client(hdev, HNAE3_DOWN_CLIENT);
+	/* check if there is any ongoing reset in the hardware. This status can
+	 * be checked from reset_pending. If there is then, we need to wait for
+	 * hardware to complete reset.
+	 *    a. If we are able to figure out in reasonable time that hardware
+	 *       has fully resetted then, we can proceed with driver, client
+	 *       reset.
+	 *    b. else, we can come back later to check this status so re-sched
+	 *       now.
+	 */
+	hdev->reset_type = hclge_get_reset_level(hdev, &hdev->reset_pending);
+	if (hdev->reset_type != HNAE3_NONE_RESET)
+		hclge_reset(hdev);
 
-		if (do_reset)
-			hclge_do_reset(hdev, hdev->reset_type);
-		else
-			set_bit(HCLGE_STATE_RESET_INT, &hdev->state);
+	/* check if we got any *new* reset requests to be honored */
+	hdev->reset_type = hclge_get_reset_level(hdev, &hdev->reset_request);
+	if (hdev->reset_type != HNAE3_NONE_RESET)
+		hclge_do_reset(hdev);
 
-		if (!hclge_reset_wait(hdev)) {
-			hclge_notify_client(hdev, HNAE3_UNINIT_CLIENT);
-			hclge_reset_ae_dev(hdev->ae_dev);
-			hclge_notify_client(hdev, HNAE3_INIT_CLIENT);
-			clear_bit(HCLGE_STATE_RESET_INT, &hdev->state);
-		}
-		hclge_notify_client(hdev, HNAE3_UP_CLIENT);
-		break;
-	default:
-		dev_err(&hdev->pdev->dev, "Unsupported reset type:%d\n",
-			hdev->reset_type);
-		break;
-	}
 	hdev->reset_type = HNAE3_NONE_RESET;
 }
 
-static void hclge_misc_irq_service_task(struct hclge_dev *hdev)
+static void hclge_reset_service_task(struct work_struct *work)
 {
+	struct hclge_dev *hdev =
+		container_of(work, struct hclge_dev, rst_service_task);
+
+	if (test_and_set_bit(HCLGE_STATE_RST_HANDLING, &hdev->state))
+		return;
+
+	clear_bit(HCLGE_STATE_RST_SERVICE_SCHED, &hdev->state);
+
 	hclge_reset_subtask(hdev);
-	hclge_enable_vector(&hdev->misc_vector, true);
+
+	clear_bit(HCLGE_STATE_RST_HANDLING, &hdev->state);
 }
 
 static void hclge_service_task(struct work_struct *work)
@@ -2630,7 +2712,6 @@ static void hclge_service_task(struct work_struct *work)
 	struct hclge_dev *hdev =
 		container_of(work, struct hclge_dev, service_task);
 
-	hclge_misc_irq_service_task(hdev);
 	hclge_update_speed_duplex(hdev);
 	hclge_update_link_status(hdev);
 	hclge_update_stats_for_all(hdev);
@@ -4661,6 +4742,8 @@ static int hclge_init_ae_dev(struct hnae3_ae_dev *ae_dev)
 	hdev->pdev = pdev;
 	hdev->ae_dev = ae_dev;
 	hdev->reset_type = HNAE3_NONE_RESET;
+	hdev->reset_request = 0;
+	hdev->reset_pending = 0;
 	ae_dev->priv = hdev;
 
 	ret = hclge_pci_init(hdev);
@@ -4772,12 +4855,15 @@ static int hclge_init_ae_dev(struct hnae3_ae_dev *ae_dev)
 
 	timer_setup(&hdev->service_timer, hclge_service_timer, 0);
 	INIT_WORK(&hdev->service_task, hclge_service_task);
+	INIT_WORK(&hdev->rst_service_task, hclge_reset_service_task);
 
 	/* Enable MISC vector(vector0) */
 	hclge_enable_vector(&hdev->misc_vector, true);
 
 	set_bit(HCLGE_STATE_SERVICE_INITED, &hdev->state);
 	set_bit(HCLGE_STATE_DOWN, &hdev->state);
+	clear_bit(HCLGE_STATE_RST_SERVICE_SCHED, &hdev->state);
+	clear_bit(HCLGE_STATE_RST_HANDLING, &hdev->state);
 
 	pr_info("%s driver initialization finished.\n", HCLGE_DRIVER_NAME);
 	return 0;
@@ -4889,14 +4975,16 @@ static void hclge_uninit_ae_dev(struct hnae3_ae_dev *ae_dev)
 		del_timer_sync(&hdev->service_timer);
 	if (hdev->service_task.func)
 		cancel_work_sync(&hdev->service_task);
+	if (hdev->rst_service_task.func)
+		cancel_work_sync(&hdev->rst_service_task);
 
 	if (mac->phydev)
 		mdiobus_unregister(mac->mdio_bus);
 
 	/* Disable MISC vector(vector0) */
 	hclge_enable_vector(&hdev->misc_vector, false);
-	hclge_free_vector(hdev, 0);
 	hclge_destroy_cmd_queue(&hdev->hw);
+	hclge_misc_irq_uninit(hdev);
 	hclge_pci_uninit(hdev);
 	ae_dev->priv = NULL;
 }
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h
index 7027814ea5d7..aacec438b933 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h
@@ -99,12 +99,19 @@ enum HCLGE_DEV_STATE {
 	HCLGE_STATE_REMOVING,
 	HCLGE_STATE_SERVICE_INITED,
 	HCLGE_STATE_SERVICE_SCHED,
+	HCLGE_STATE_RST_SERVICE_SCHED,
+	HCLGE_STATE_RST_HANDLING,
 	HCLGE_STATE_MBX_HANDLING,
 	HCLGE_STATE_MBX_IRQ,
-	HCLGE_STATE_RESET_INT,
 	HCLGE_STATE_MAX
 };
 
+enum hclge_evt_cause {
+	HCLGE_VECTOR0_EVENT_RST,
+	HCLGE_VECTOR0_EVENT_MBX,
+	HCLGE_VECTOR0_EVENT_OTHER,
+};
+
 #define HCLGE_MPF_ENBALE 1
 struct hclge_caps {
 	u16 num_tqp;
@@ -420,6 +427,8 @@ struct hclge_dev {
 	unsigned long state;
 
 	enum hnae3_reset_type reset_type;
+	unsigned long reset_request;	/* reset has been requested */
+	unsigned long reset_pending;	/* client rst is pending to be served */
 	u32 fw_version;
 	u16 num_vmdq_vport;		/* Num vmdq vport this PF has set up */
 	u16 num_tqps;			/* Num task queue pairs of this PF */
@@ -469,6 +478,7 @@ struct hclge_dev {
 	unsigned long service_timer_previous;
 	struct timer_list service_timer;
 	struct work_struct service_task;
+	struct work_struct rst_service_task;
 
 	bool cur_promisc;
 	int num_alloc_vfs;	/* Actual number of VFs allocated */
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 62a18914f00f..7737a05c717c 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -9101,9 +9101,11 @@ static int parse_tc_actions(struct ixgbe_adapter *adapter,
 
 		/* Redirect to a VF or a offloaded macvlan */
 		if (is_tcf_mirred_egress_redirect(a)) {
-			int ifindex = tcf_mirred_ifindex(a);
+			struct net_device *dev = tcf_mirred_dev(a);
 
-			err = handle_redirect_action(adapter, ifindex, queue,
+			if (!dev)
+				return -EINVAL;
+			err = handle_redirect_action(adapter, dev->ifindex, queue,
 						     action);
 			if (err == 0)
 				return err;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index d2b057a3e512..0f5c012de52e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -4308,9 +4308,6 @@ static void mlx5e_nic_cleanup(struct mlx5e_priv *priv)
 {
 	mlx5e_ipsec_cleanup(priv);
 	mlx5e_vxlan_cleanup(priv);
-
-	if (priv->channels.params.xdp_prog)
-		bpf_prog_put(priv->channels.params.xdp_prog);
 }
 
 static int mlx5e_init_nic_rx(struct mlx5e_priv *priv)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index 55979ec2e88a..3e03d2e8f96a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -1982,11 +1982,10 @@ static int parse_tc_fdb_actions(struct mlx5e_priv *priv, struct tcf_exts *exts,
 		}
 
 		if (is_tcf_mirred_egress_redirect(a)) {
-			int ifindex = tcf_mirred_ifindex(a);
 			struct net_device *out_dev;
 			struct mlx5e_priv *out_priv;
 
-			out_dev = __dev_get_by_index(dev_net(priv->netdev), ifindex);
+			out_dev = tcf_mirred_dev(a);
 
 			if (switchdev_port_same_parent_id(priv->netdev,
 							  out_dev)) {
@@ -1996,7 +1995,7 @@ static int parse_tc_fdb_actions(struct mlx5e_priv *priv, struct tcf_exts *exts,
 				rpriv = out_priv->ppriv;
 				attr->out_rep = rpriv->rep;
 			} else if (encap) {
-				parse_attr->mirred_ifindex = ifindex;
+				parse_attr->mirred_ifindex = out_dev->ifindex;
 				parse_attr->tun_info = *info;
 				attr->parse_attr = parse_attr;
 				attr->action |= MLX5_FLOW_CONTEXT_ACTION_ENCAP |
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
index 2d0897b7d860..3b9c8a0437bf 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
@@ -1571,14 +1571,11 @@ mlxsw_sp_port_add_cls_matchall_mirror(struct mlxsw_sp_port *mlxsw_sp_port,
 				      const struct tc_action *a,
 				      bool ingress)
 {
-	struct net *net = dev_net(mlxsw_sp_port->dev);
 	enum mlxsw_sp_span_type span_type;
 	struct mlxsw_sp_port *to_port;
 	struct net_device *to_dev;
-	int ifindex;
 
-	ifindex = tcf_mirred_ifindex(a);
-	to_dev = __dev_get_by_index(net, ifindex);
+	to_dev = tcf_mirred_dev(a);
 	if (!to_dev) {
 		netdev_err(mlxsw_sp_port->dev, "Could not find requested device\n");
 		return -EINVAL;
@@ -1838,6 +1835,54 @@ static int mlxsw_sp_setup_tc(struct net_device *dev, enum tc_setup_type type,
 	}
 }
 
+
+static int mlxsw_sp_feature_hw_tc(struct net_device *dev, bool enable)
+{
+	struct mlxsw_sp_port *mlxsw_sp_port = netdev_priv(dev);
+
+	if (!enable && (mlxsw_sp_port->acl_rule_count ||
+			!list_empty(&mlxsw_sp_port->mall_tc_list))) {
+		netdev_err(dev, "Active offloaded tc filters, can't turn hw_tc_offload off\n");
+		return -EINVAL;
+	}
+	return 0;
+}
+
+typedef int (*mlxsw_sp_feature_handler)(struct net_device *dev, bool enable);
+
+static int mlxsw_sp_handle_feature(struct net_device *dev,
+				   netdev_features_t wanted_features,
+				   netdev_features_t feature,
+				   mlxsw_sp_feature_handler feature_handler)
+{
+	netdev_features_t changes = wanted_features ^ dev->features;
+	bool enable = !!(wanted_features & feature);
+	int err;
+
+	if (!(changes & feature))
+		return 0;
+
+	err = feature_handler(dev, enable);
+	if (err) {
+		netdev_err(dev, "%s feature %pNF failed, err %d\n",
+			   enable ? "Enable" : "Disable", &feature, err);
+		return err;
+	}
+
+	if (enable)
+		dev->features |= feature;
+	else
+		dev->features &= ~feature;
+
+	return 0;
+}
+static int mlxsw_sp_set_features(struct net_device *dev,
+				 netdev_features_t features)
+{
+	return mlxsw_sp_handle_feature(dev, features, NETIF_F_HW_TC,
+				       mlxsw_sp_feature_hw_tc);
+}
+
 static const struct net_device_ops mlxsw_sp_port_netdev_ops = {
 	.ndo_open		= mlxsw_sp_port_open,
 	.ndo_stop		= mlxsw_sp_port_stop,
@@ -1852,6 +1897,7 @@ static const struct net_device_ops mlxsw_sp_port_netdev_ops = {
 	.ndo_vlan_rx_add_vid	= mlxsw_sp_port_add_vid,
 	.ndo_vlan_rx_kill_vid	= mlxsw_sp_port_kill_vid,
 	.ndo_get_phys_port_name	= mlxsw_sp_port_get_phys_port_name,
+	.ndo_set_features	= mlxsw_sp_set_features,
 };
 
 static void mlxsw_sp_port_get_drvinfo(struct net_device *dev,
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
index 432ab9b12b7f..a0adcd886589 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
@@ -270,6 +270,7 @@ struct mlxsw_sp_port {
 	struct mlxsw_sp_port_sample *sample;
 	struct list_head vlans_list;
 	struct mlxsw_sp_qdisc root_qdisc;
+	unsigned acl_rule_count;
 };
 
 static inline bool
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c
index 2f0e57857ea4..42e8a36b9b95 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_flower.c
@@ -92,7 +92,6 @@ static int mlxsw_sp_flower_parse_actions(struct mlxsw_sp *mlxsw_sp,
 			if (err)
 				return err;
 		} else if (is_tcf_mirred_egress_redirect(a)) {
-			int ifindex = tcf_mirred_ifindex(a);
 			struct net_device *out_dev;
 			struct mlxsw_sp_fid *fid;
 			u16 fid_index;
@@ -104,7 +103,7 @@ static int mlxsw_sp_flower_parse_actions(struct mlxsw_sp *mlxsw_sp,
 			if (err)
 				return err;
 
-			out_dev = __dev_get_by_index(dev_net(dev), ifindex);
+			out_dev = tcf_mirred_dev(a);
 			if (out_dev == dev)
 				out_dev = NULL;
 
@@ -424,6 +423,7 @@ int mlxsw_sp_flower_replace(struct mlxsw_sp_port *mlxsw_sp_port, bool ingress,
 		goto err_rule_add;
 
 	mlxsw_sp_acl_ruleset_put(mlxsw_sp, ruleset);
+	mlxsw_sp_port->acl_rule_count++;
 	return 0;
 
 err_rule_add:
@@ -455,6 +455,7 @@ void mlxsw_sp_flower_destroy(struct mlxsw_sp_port *mlxsw_sp_port, bool ingress,
 	}
 
 	mlxsw_sp_acl_ruleset_put(mlxsw_sp, ruleset);
+	mlxsw_sp_port->acl_rule_count--;
 }
 
 int mlxsw_sp_flower_stats(struct mlxsw_sp_port *mlxsw_sp_port, bool ingress,
diff --git a/drivers/net/ethernet/netronome/nfp/Makefile b/drivers/net/ethernet/netronome/nfp/Makefile
index 24c4408b5734..6e5ef984398b 100644
--- a/drivers/net/ethernet/netronome/nfp/Makefile
+++ b/drivers/net/ethernet/netronome/nfp/Makefile
@@ -22,6 +22,7 @@ nfp-objs := \
 	    nfp_hwmon.o \
 	    nfp_main.o \
 	    nfp_net_common.o \
+	    nfp_net_debugdump.o \
 	    nfp_net_ethtool.o \
 	    nfp_net_main.o \
 	    nfp_net_repr.o \
diff --git a/drivers/net/ethernet/netronome/nfp/bpf/jit.c b/drivers/net/ethernet/netronome/nfp/bpf/jit.c
index 995e95410b11..3419ad495962 100644
--- a/drivers/net/ethernet/netronome/nfp/bpf/jit.c
+++ b/drivers/net/ethernet/netronome/nfp/bpf/jit.c
@@ -1,5 +1,5 @@
 /*
- * Copyright (C) 2016 Netronome Systems, Inc.
+ * Copyright (C) 2016-2017 Netronome Systems, Inc.
  *
  * This software is dual licensed under the GNU General License Version 2,
  * June 1991 as shown in the file COPYING in the top-level directory of this
@@ -66,12 +66,6 @@
 	     next2 = nfp_meta_next(next))
 
 static bool
-nfp_meta_has_next(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
-{
-	return meta->l.next != &nfp_prog->insns;
-}
-
-static bool
 nfp_meta_has_prev(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 {
 	return meta->l.prev != &nfp_prog->insns;
@@ -102,7 +96,7 @@ nfp_prog_offset_to_index(struct nfp_prog *nfp_prog, unsigned int offset)
 /* --- Emitters --- */
 static void
 __emit_cmd(struct nfp_prog *nfp_prog, enum cmd_tgt_map op,
-	   u8 mode, u8 xfer, u8 areg, u8 breg, u8 size, bool sync)
+	   u8 mode, u8 xfer, u8 areg, u8 breg, u8 size, bool sync, bool indir)
 {
 	enum cmd_ctx_swap ctx;
 	u64 insn;
@@ -120,14 +114,15 @@ __emit_cmd(struct nfp_prog *nfp_prog, enum cmd_tgt_map op,
 		FIELD_PREP(OP_CMD_CNT, size) |
 		FIELD_PREP(OP_CMD_SIG, sync) |
 		FIELD_PREP(OP_CMD_TGT_CMD, cmd_tgt_act[op].tgt_cmd) |
+		FIELD_PREP(OP_CMD_INDIR, indir) |
 		FIELD_PREP(OP_CMD_MODE, mode);
 
 	nfp_prog_push(nfp_prog, insn);
 }
 
 static void
-emit_cmd(struct nfp_prog *nfp_prog, enum cmd_tgt_map op,
-	 u8 mode, u8 xfer, swreg lreg, swreg rreg, u8 size, bool sync)
+emit_cmd_any(struct nfp_prog *nfp_prog, enum cmd_tgt_map op, u8 mode, u8 xfer,
+	     swreg lreg, swreg rreg, u8 size, bool sync, bool indir)
 {
 	struct nfp_insn_re_regs reg;
 	int err;
@@ -148,7 +143,22 @@ emit_cmd(struct nfp_prog *nfp_prog, enum cmd_tgt_map op,
 		return;
 	}
 
-	__emit_cmd(nfp_prog, op, mode, xfer, reg.areg, reg.breg, size, sync);
+	__emit_cmd(nfp_prog, op, mode, xfer, reg.areg, reg.breg, size, sync,
+		   indir);
+}
+
+static void
+emit_cmd(struct nfp_prog *nfp_prog, enum cmd_tgt_map op, u8 mode, u8 xfer,
+	 swreg lreg, swreg rreg, u8 size, bool sync)
+{
+	emit_cmd_any(nfp_prog, op, mode, xfer, lreg, rreg, size, sync, false);
+}
+
+static void
+emit_cmd_indir(struct nfp_prog *nfp_prog, enum cmd_tgt_map op, u8 mode, u8 xfer,
+	       swreg lreg, swreg rreg, u8 size, bool sync)
+{
+	emit_cmd_any(nfp_prog, op, mode, xfer, lreg, rreg, size, sync, true);
 }
 
 static void
@@ -230,9 +240,11 @@ emit_immed(struct nfp_prog *nfp_prog, swreg dst, u16 imm,
 		return;
 	}
 
-	__emit_immed(nfp_prog, reg.areg, reg.breg, imm >> 8, width,
-		     invert, shift, reg.wr_both,
-		     reg.dst_lmextn, reg.src_lmextn);
+	/* Use reg.dst when destination is No-Dest. */
+	__emit_immed(nfp_prog,
+		     swreg_type(dst) == NN_REG_NONE ? reg.dst : reg.areg,
+		     reg.breg, imm >> 8, width, invert, shift,
+		     reg.wr_both, reg.dst_lmextn, reg.src_lmextn);
 }
 
 static void
@@ -510,6 +522,147 @@ static void wrp_reg_mov(struct nfp_prog *nfp_prog, u16 dst, u16 src)
 	wrp_mov(nfp_prog, reg_both(dst), reg_b(src));
 }
 
+/* wrp_reg_subpart() - load @field_len bytes from @offset of @src, write the
+ * result to @dst from low end.
+ */
+static void
+wrp_reg_subpart(struct nfp_prog *nfp_prog, swreg dst, swreg src, u8 field_len,
+		u8 offset)
+{
+	enum shf_sc sc = offset ? SHF_SC_R_SHF : SHF_SC_NONE;
+	u8 mask = (1 << field_len) - 1;
+
+	emit_ld_field_any(nfp_prog, dst, mask, src, sc, offset * 8, true);
+}
+
+/* NFP has Command Push Pull bus which supports bluk memory operations. */
+static int nfp_cpp_memcpy(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
+{
+	bool descending_seq = meta->ldst_gather_len < 0;
+	s16 len = abs(meta->ldst_gather_len);
+	swreg src_base, off;
+	unsigned int i;
+	u8 xfer_num;
+
+	off = re_load_imm_any(nfp_prog, meta->insn.off, imm_b(nfp_prog));
+	src_base = reg_a(meta->insn.src_reg * 2);
+	xfer_num = round_up(len, 4) / 4;
+
+	/* Setup PREV_ALU fields to override memory read length. */
+	if (len > 32)
+		wrp_immed(nfp_prog, reg_none(),
+			  CMD_OVE_LEN | FIELD_PREP(CMD_OV_LEN, xfer_num - 1));
+
+	/* Memory read from source addr into transfer-in registers. */
+	emit_cmd_any(nfp_prog, CMD_TGT_READ32_SWAP, CMD_MODE_32b, 0, src_base,
+		     off, xfer_num - 1, true, len > 32);
+
+	/* Move from transfer-in to transfer-out. */
+	for (i = 0; i < xfer_num; i++)
+		wrp_mov(nfp_prog, reg_xfer(i), reg_xfer(i));
+
+	off = re_load_imm_any(nfp_prog, meta->paired_st->off, imm_b(nfp_prog));
+
+	if (len <= 8) {
+		/* Use single direct_ref write8. */
+		emit_cmd(nfp_prog, CMD_TGT_WRITE8_SWAP, CMD_MODE_32b, 0,
+			 reg_a(meta->paired_st->dst_reg * 2), off, len - 1,
+			 true);
+	} else if (len <= 32 && IS_ALIGNED(len, 4)) {
+		/* Use single direct_ref write32. */
+		emit_cmd(nfp_prog, CMD_TGT_WRITE32_SWAP, CMD_MODE_32b, 0,
+			 reg_a(meta->paired_st->dst_reg * 2), off, xfer_num - 1,
+			 true);
+	} else if (len <= 32) {
+		/* Use single indirect_ref write8. */
+		wrp_immed(nfp_prog, reg_none(),
+			  CMD_OVE_LEN | FIELD_PREP(CMD_OV_LEN, len - 1));
+		emit_cmd_indir(nfp_prog, CMD_TGT_WRITE8_SWAP, CMD_MODE_32b, 0,
+			       reg_a(meta->paired_st->dst_reg * 2), off,
+			       len - 1, true);
+	} else if (IS_ALIGNED(len, 4)) {
+		/* Use single indirect_ref write32. */
+		wrp_immed(nfp_prog, reg_none(),
+			  CMD_OVE_LEN | FIELD_PREP(CMD_OV_LEN, xfer_num - 1));
+		emit_cmd_indir(nfp_prog, CMD_TGT_WRITE32_SWAP, CMD_MODE_32b, 0,
+			       reg_a(meta->paired_st->dst_reg * 2), off,
+			       xfer_num - 1, true);
+	} else if (len <= 40) {
+		/* Use one direct_ref write32 to write the first 32-bytes, then
+		 * another direct_ref write8 to write the remaining bytes.
+		 */
+		emit_cmd(nfp_prog, CMD_TGT_WRITE32_SWAP, CMD_MODE_32b, 0,
+			 reg_a(meta->paired_st->dst_reg * 2), off, 7,
+			 true);
+
+		off = re_load_imm_any(nfp_prog, meta->paired_st->off + 32,
+				      imm_b(nfp_prog));
+		emit_cmd(nfp_prog, CMD_TGT_WRITE8_SWAP, CMD_MODE_32b, 8,
+			 reg_a(meta->paired_st->dst_reg * 2), off, len - 33,
+			 true);
+	} else {
+		/* Use one indirect_ref write32 to write 4-bytes aligned length,
+		 * then another direct_ref write8 to write the remaining bytes.
+		 */
+		u8 new_off;
+
+		wrp_immed(nfp_prog, reg_none(),
+			  CMD_OVE_LEN | FIELD_PREP(CMD_OV_LEN, xfer_num - 2));
+		emit_cmd_indir(nfp_prog, CMD_TGT_WRITE32_SWAP, CMD_MODE_32b, 0,
+			       reg_a(meta->paired_st->dst_reg * 2), off,
+			       xfer_num - 2, true);
+		new_off = meta->paired_st->off + (xfer_num - 1) * 4;
+		off = re_load_imm_any(nfp_prog, new_off, imm_b(nfp_prog));
+		emit_cmd(nfp_prog, CMD_TGT_WRITE8_SWAP, CMD_MODE_32b,
+			 xfer_num - 1, reg_a(meta->paired_st->dst_reg * 2), off,
+			 (len & 0x3) - 1, true);
+	}
+
+	/* TODO: The following extra load is to make sure data flow be identical
+	 *  before and after we do memory copy optimization.
+	 *
+	 *  The load destination register is not guaranteed to be dead, so we
+	 *  need to make sure it is loaded with the value the same as before
+	 *  this transformation.
+	 *
+	 *  These extra loads could be removed once we have accurate register
+	 *  usage information.
+	 */
+	if (descending_seq)
+		xfer_num = 0;
+	else if (BPF_SIZE(meta->insn.code) != BPF_DW)
+		xfer_num = xfer_num - 1;
+	else
+		xfer_num = xfer_num - 2;
+
+	switch (BPF_SIZE(meta->insn.code)) {
+	case BPF_B:
+		wrp_reg_subpart(nfp_prog, reg_both(meta->insn.dst_reg * 2),
+				reg_xfer(xfer_num), 1,
+				IS_ALIGNED(len, 4) ? 3 : (len & 3) - 1);
+		break;
+	case BPF_H:
+		wrp_reg_subpart(nfp_prog, reg_both(meta->insn.dst_reg * 2),
+				reg_xfer(xfer_num), 2, (len & 3) ^ 2);
+		break;
+	case BPF_W:
+		wrp_mov(nfp_prog, reg_both(meta->insn.dst_reg * 2),
+			reg_xfer(0));
+		break;
+	case BPF_DW:
+		wrp_mov(nfp_prog, reg_both(meta->insn.dst_reg * 2),
+			reg_xfer(xfer_num));
+		wrp_mov(nfp_prog, reg_both(meta->insn.dst_reg * 2 + 1),
+			reg_xfer(xfer_num + 1));
+		break;
+	}
+
+	if (BPF_SIZE(meta->insn.code) != BPF_DW)
+		wrp_immed(nfp_prog, reg_both(meta->insn.dst_reg * 2 + 1), 0);
+
+	return 0;
+}
+
 static int
 data_ld(struct nfp_prog *nfp_prog, swreg offset, u8 dst_gpr, int size)
 {
@@ -975,9 +1128,6 @@ wrp_test_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta,
 {
 	const struct bpf_insn *insn = &meta->insn;
 
-	if (insn->off < 0) /* TODO */
-		return -EOPNOTSUPP;
-
 	wrp_test_reg_one(nfp_prog, insn->dst_reg * 2, alu_op,
 			 insn->src_reg * 2, br_mask, insn->off);
 	wrp_test_reg_one(nfp_prog, insn->dst_reg * 2 + 1, alu_op,
@@ -995,9 +1145,6 @@ wrp_cmp_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta,
 	u8 reg = insn->dst_reg * 2;
 	swreg tmp_reg;
 
-	if (insn->off < 0) /* TODO */
-		return -EOPNOTSUPP;
-
 	tmp_reg = ur_load_imm_any(nfp_prog, imm & ~0U, imm_b(nfp_prog));
 	if (!swap)
 		emit_alu(nfp_prog, reg_none(), reg_a(reg), ALU_OP_SUB, tmp_reg);
@@ -1027,9 +1174,6 @@ wrp_cmp_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta,
 	areg = insn->dst_reg * 2;
 	breg = insn->src_reg * 2;
 
-	if (insn->off < 0) /* TODO */
-		return -EOPNOTSUPP;
-
 	if (swap) {
 		areg ^= breg;
 		breg ^= areg;
@@ -1494,6 +1638,9 @@ static int
 mem_ldx(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta,
 	unsigned int size)
 {
+	if (meta->ldst_gather_len)
+		return nfp_cpp_memcpy(nfp_prog, meta);
+
 	if (meta->ptr.type == PTR_TO_CTX) {
 		if (nfp_prog->type == BPF_PROG_TYPE_XDP)
 			return mem_ldx_xdp(nfp_prog, meta, size);
@@ -1630,8 +1777,6 @@ static int mem_stx8(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 
 static int jump(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 {
-	if (meta->insn.off < 0) /* TODO */
-		return -EOPNOTSUPP;
 	emit_br(nfp_prog, BR_UNC, meta->insn.off, 0);
 
 	return 0;
@@ -1646,9 +1791,6 @@ static int jeq_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 	or1 = reg_a(insn->dst_reg * 2);
 	or2 = reg_b(insn->dst_reg * 2 + 1);
 
-	if (insn->off < 0) /* TODO */
-		return -EOPNOTSUPP;
-
 	if (imm & ~0U) {
 		tmp_reg = ur_load_imm_any(nfp_prog, imm & ~0U, imm_b(nfp_prog));
 		emit_alu(nfp_prog, imm_a(nfp_prog),
@@ -1695,9 +1837,6 @@ static int jset_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 	u64 imm = insn->imm; /* sign extend */
 	swreg tmp_reg;
 
-	if (insn->off < 0) /* TODO */
-		return -EOPNOTSUPP;
-
 	if (!imm) {
 		meta->skip = true;
 		return 0;
@@ -1726,9 +1865,6 @@ static int jne_imm(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 	u64 imm = insn->imm; /* sign extend */
 	swreg tmp_reg;
 
-	if (insn->off < 0) /* TODO */
-		return -EOPNOTSUPP;
-
 	if (!imm) {
 		emit_alu(nfp_prog, reg_none(), reg_a(insn->dst_reg * 2),
 			 ALU_OP_OR, reg_b(insn->dst_reg * 2 + 1));
@@ -1753,9 +1889,6 @@ static int jeq_reg(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 {
 	const struct bpf_insn *insn = &meta->insn;
 
-	if (insn->off < 0) /* TODO */
-		return -EOPNOTSUPP;
-
 	emit_alu(nfp_prog, imm_a(nfp_prog), reg_a(insn->dst_reg * 2),
 		 ALU_OP_XOR, reg_b(insn->src_reg * 2));
 	emit_alu(nfp_prog, imm_b(nfp_prog), reg_a(insn->dst_reg * 2 + 1),
@@ -1887,17 +2020,22 @@ static void br_set_offset(u64 *instr, u16 offset)
 /* --- Assembler logic --- */
 static int nfp_fixup_branches(struct nfp_prog *nfp_prog)
 {
-	struct nfp_insn_meta *meta, *next;
-	u32 off, br_idx;
-	u32 idx;
+	struct nfp_insn_meta *meta, *jmp_dst;
+	u32 idx, br_idx;
 
-	nfp_for_each_insn_walk2(nfp_prog, meta, next) {
+	list_for_each_entry(meta, &nfp_prog->insns, l) {
 		if (meta->skip)
 			continue;
 		if (BPF_CLASS(meta->insn.code) != BPF_JMP)
 			continue;
 
-		br_idx = nfp_prog_offset_to_index(nfp_prog, next->off) - 1;
+		if (list_is_last(&meta->l, &nfp_prog->insns))
+			idx = nfp_prog->last_bpf_off;
+		else
+			idx = list_next_entry(meta, l)->off - 1;
+
+		br_idx = nfp_prog_offset_to_index(nfp_prog, idx);
+
 		if (!nfp_is_br(nfp_prog->prog[br_idx])) {
 			pr_err("Fixup found block not ending in branch %d %02x %016llx!!\n",
 			       br_idx, meta->insn.code, nfp_prog->prog[br_idx]);
@@ -1907,23 +2045,14 @@ static int nfp_fixup_branches(struct nfp_prog *nfp_prog)
 		if (FIELD_GET(OP_BR_SPECIAL, nfp_prog->prog[br_idx]))
 			continue;
 
-		/* Find the target offset in assembler realm */
-		off = meta->insn.off;
-		if (!off) {
-			pr_err("Fixup found zero offset!!\n");
+		if (!meta->jmp_dst) {
+			pr_err("Non-exit jump doesn't have destination info recorded!!\n");
 			return -ELOOP;
 		}
 
-		while (off && nfp_meta_has_next(nfp_prog, next)) {
-			next = nfp_meta_next(next);
-			off--;
-		}
-		if (off) {
-			pr_err("Fixup found too large jump!! %d\n", off);
-			return -ELOOP;
-		}
+		jmp_dst = meta->jmp_dst;
 
-		if (next->skip) {
+		if (jmp_dst->skip) {
 			pr_err("Branch landing on removed instruction!!\n");
 			return -ELOOP;
 		}
@@ -1932,7 +2061,7 @@ static int nfp_fixup_branches(struct nfp_prog *nfp_prog)
 		     idx <= br_idx; idx++) {
 			if (!nfp_is_br(nfp_prog->prog[idx]))
 				continue;
-			br_set_offset(&nfp_prog->prog[idx], next->off);
+			br_set_offset(&nfp_prog->prog[idx], jmp_dst->off);
 		}
 	}
 
@@ -2105,6 +2234,8 @@ static int nfp_translate(struct nfp_prog *nfp_prog)
 		nfp_prog->n_translated++;
 	}
 
+	nfp_prog->last_bpf_off = nfp_prog_current_offset(nfp_prog) - 1;
+
 	nfp_outro(nfp_prog);
 	if (nfp_prog->error)
 		return nfp_prog->error;
@@ -2173,6 +2304,9 @@ static void nfp_bpf_opt_ld_mask(struct nfp_prog *nfp_prog)
 		if (next.src_reg || next.dst_reg)
 			continue;
 
+		if (meta2->flags & FLAG_INSN_IS_JUMP_DST)
+			continue;
+
 		meta2->skip = true;
 	}
 }
@@ -2209,17 +2343,258 @@ static void nfp_bpf_opt_ld_shift(struct nfp_prog *nfp_prog)
 		if (next1.imm != 0x20 || next2.imm != 0x20)
 			continue;
 
+		if (meta2->flags & FLAG_INSN_IS_JUMP_DST ||
+		    meta3->flags & FLAG_INSN_IS_JUMP_DST)
+			continue;
+
 		meta2->skip = true;
 		meta3->skip = true;
 	}
 }
 
+/* load/store pair that forms memory copy sould look like the following:
+ *
+ *   ld_width R, [addr_src + offset_src]
+ *   st_width [addr_dest + offset_dest], R
+ *
+ * The destination register of load and source register of store should
+ * be the same, load and store should also perform at the same width.
+ * If either of addr_src or addr_dest is stack pointer, we don't do the
+ * CPP optimization as stack is modelled by registers on NFP.
+ */
+static bool
+curr_pair_is_memcpy(struct nfp_insn_meta *ld_meta,
+		    struct nfp_insn_meta *st_meta)
+{
+	struct bpf_insn *ld = &ld_meta->insn;
+	struct bpf_insn *st = &st_meta->insn;
+
+	if (!is_mbpf_load(ld_meta) || !is_mbpf_store(st_meta))
+		return false;
+
+	if (ld_meta->ptr.type != PTR_TO_PACKET)
+		return false;
+
+	if (st_meta->ptr.type != PTR_TO_PACKET)
+		return false;
+
+	if (BPF_SIZE(ld->code) != BPF_SIZE(st->code))
+		return false;
+
+	if (ld->dst_reg != st->src_reg)
+		return false;
+
+	/* There is jump to the store insn in this pair. */
+	if (st_meta->flags & FLAG_INSN_IS_JUMP_DST)
+		return false;
+
+	return true;
+}
+
+/* Currently, we only support chaining load/store pairs if:
+ *
+ *  - Their address base registers are the same.
+ *  - Their address offsets are in the same order.
+ *  - They operate at the same memory width.
+ *  - There is no jump into the middle of them.
+ */
+static bool
+curr_pair_chain_with_previous(struct nfp_insn_meta *ld_meta,
+			      struct nfp_insn_meta *st_meta,
+			      struct bpf_insn *prev_ld,
+			      struct bpf_insn *prev_st)
+{
+	u8 prev_size, curr_size, prev_ld_base, prev_st_base, prev_ld_dst;
+	struct bpf_insn *ld = &ld_meta->insn;
+	struct bpf_insn *st = &st_meta->insn;
+	s16 prev_ld_off, prev_st_off;
+
+	/* This pair is the start pair. */
+	if (!prev_ld)
+		return true;
+
+	prev_size = BPF_LDST_BYTES(prev_ld);
+	curr_size = BPF_LDST_BYTES(ld);
+	prev_ld_base = prev_ld->src_reg;
+	prev_st_base = prev_st->dst_reg;
+	prev_ld_dst = prev_ld->dst_reg;
+	prev_ld_off = prev_ld->off;
+	prev_st_off = prev_st->off;
+
+	if (ld->dst_reg != prev_ld_dst)
+		return false;
+
+	if (ld->src_reg != prev_ld_base || st->dst_reg != prev_st_base)
+		return false;
+
+	if (curr_size != prev_size)
+		return false;
+
+	/* There is jump to the head of this pair. */
+	if (ld_meta->flags & FLAG_INSN_IS_JUMP_DST)
+		return false;
+
+	/* Both in ascending order. */
+	if (prev_ld_off + prev_size == ld->off &&
+	    prev_st_off + prev_size == st->off)
+		return true;
+
+	/* Both in descending order. */
+	if (ld->off + curr_size == prev_ld_off &&
+	    st->off + curr_size == prev_st_off)
+		return true;
+
+	return false;
+}
+
+/* Return TRUE if cross memory access happens. Cross memory access means
+ * store area is overlapping with load area that a later load might load
+ * the value from previous store, for this case we can't treat the sequence
+ * as an memory copy.
+ */
+static bool
+cross_mem_access(struct bpf_insn *ld, struct nfp_insn_meta *head_ld_meta,
+		 struct nfp_insn_meta *head_st_meta)
+{
+	s16 head_ld_off, head_st_off, ld_off;
+
+	/* Different pointer types does not overlap. */
+	if (head_ld_meta->ptr.type != head_st_meta->ptr.type)
+		return false;
+
+	/* load and store are both PTR_TO_PACKET, check ID info.  */
+	if (head_ld_meta->ptr.id != head_st_meta->ptr.id)
+		return true;
+
+	/* Canonicalize the offsets. Turn all of them against the original
+	 * base register.
+	 */
+	head_ld_off = head_ld_meta->insn.off + head_ld_meta->ptr.off;
+	head_st_off = head_st_meta->insn.off + head_st_meta->ptr.off;
+	ld_off = ld->off + head_ld_meta->ptr.off;
+
+	/* Ascending order cross. */
+	if (ld_off > head_ld_off &&
+	    head_ld_off < head_st_off && ld_off >= head_st_off)
+		return true;
+
+	/* Descending order cross. */
+	if (ld_off < head_ld_off &&
+	    head_ld_off > head_st_off && ld_off <= head_st_off)
+		return true;
+
+	return false;
+}
+
+/* This pass try to identify the following instructoin sequences.
+ *
+ *   load R, [regA + offA]
+ *   store [regB + offB], R
+ *   load R, [regA + offA + const_imm_A]
+ *   store [regB + offB + const_imm_A], R
+ *   load R, [regA + offA + 2 * const_imm_A]
+ *   store [regB + offB + 2 * const_imm_A], R
+ *   ...
+ *
+ * Above sequence is typically generated by compiler when lowering
+ * memcpy. NFP prefer using CPP instructions to accelerate it.
+ */
+static void nfp_bpf_opt_ldst_gather(struct nfp_prog *nfp_prog)
+{
+	struct nfp_insn_meta *head_ld_meta = NULL;
+	struct nfp_insn_meta *head_st_meta = NULL;
+	struct nfp_insn_meta *meta1, *meta2;
+	struct bpf_insn *prev_ld = NULL;
+	struct bpf_insn *prev_st = NULL;
+	u8 count = 0;
+
+	nfp_for_each_insn_walk2(nfp_prog, meta1, meta2) {
+		struct bpf_insn *ld = &meta1->insn;
+		struct bpf_insn *st = &meta2->insn;
+
+		/* Reset record status if any of the following if true:
+		 *   - The current insn pair is not load/store.
+		 *   - The load/store pair doesn't chain with previous one.
+		 *   - The chained load/store pair crossed with previous pair.
+		 *   - The chained load/store pair has a total size of memory
+		 *     copy beyond 128 bytes which is the maximum length a
+		 *     single NFP CPP command can transfer.
+		 */
+		if (!curr_pair_is_memcpy(meta1, meta2) ||
+		    !curr_pair_chain_with_previous(meta1, meta2, prev_ld,
+						   prev_st) ||
+		    (head_ld_meta && (cross_mem_access(ld, head_ld_meta,
+						       head_st_meta) ||
+				      head_ld_meta->ldst_gather_len >= 128))) {
+			if (!count)
+				continue;
+
+			if (count > 1) {
+				s16 prev_ld_off = prev_ld->off;
+				s16 prev_st_off = prev_st->off;
+				s16 head_ld_off = head_ld_meta->insn.off;
+
+				if (prev_ld_off < head_ld_off) {
+					head_ld_meta->insn.off = prev_ld_off;
+					head_st_meta->insn.off = prev_st_off;
+					head_ld_meta->ldst_gather_len =
+						-head_ld_meta->ldst_gather_len;
+				}
+
+				head_ld_meta->paired_st = &head_st_meta->insn;
+				head_st_meta->skip = true;
+			} else {
+				head_ld_meta->ldst_gather_len = 0;
+			}
+
+			/* If the chain is ended by an load/store pair then this
+			 * could serve as the new head of the the next chain.
+			 */
+			if (curr_pair_is_memcpy(meta1, meta2)) {
+				head_ld_meta = meta1;
+				head_st_meta = meta2;
+				head_ld_meta->ldst_gather_len =
+					BPF_LDST_BYTES(ld);
+				meta1 = nfp_meta_next(meta1);
+				meta2 = nfp_meta_next(meta2);
+				prev_ld = ld;
+				prev_st = st;
+				count = 1;
+			} else {
+				head_ld_meta = NULL;
+				head_st_meta = NULL;
+				prev_ld = NULL;
+				prev_st = NULL;
+				count = 0;
+			}
+
+			continue;
+		}
+
+		if (!head_ld_meta) {
+			head_ld_meta = meta1;
+			head_st_meta = meta2;
+		} else {
+			meta1->skip = true;
+			meta2->skip = true;
+		}
+
+		head_ld_meta->ldst_gather_len += BPF_LDST_BYTES(ld);
+		meta1 = nfp_meta_next(meta1);
+		meta2 = nfp_meta_next(meta2);
+		prev_ld = ld;
+		prev_st = st;
+		count++;
+	}
+}
+
 static int nfp_bpf_optimize(struct nfp_prog *nfp_prog)
 {
 	nfp_bpf_opt_reg_init(nfp_prog);
 
 	nfp_bpf_opt_ld_mask(nfp_prog);
 	nfp_bpf_opt_ld_shift(nfp_prog);
+	nfp_bpf_opt_ldst_gather(nfp_prog);
 
 	return 0;
 }
diff --git a/drivers/net/ethernet/netronome/nfp/bpf/main.c b/drivers/net/ethernet/netronome/nfp/bpf/main.c
index e379b78e86ef..54bfd7846f6d 100644
--- a/drivers/net/ethernet/netronome/nfp/bpf/main.c
+++ b/drivers/net/ethernet/netronome/nfp/bpf/main.c
@@ -82,12 +82,6 @@ static const char *nfp_bpf_extra_cap(struct nfp_app *app, struct nfp_net *nn)
 	return nfp_net_ebpf_capable(nn) ? "BPF" : "";
 }
 
-static void nfp_bpf_vnic_free(struct nfp_app *app, struct nfp_net *nn)
-{
-	if (nn->dp.bpf_offload_xdp)
-		nfp_bpf_xdp_offload(app, nn, NULL);
-}
-
 static int nfp_bpf_setup_tc_block_cb(enum tc_setup_type type,
 				     void *type_data, void *cb_priv)
 {
@@ -168,7 +162,6 @@ const struct nfp_app_type app_bpf = {
 	.extra_cap	= nfp_bpf_extra_cap,
 
 	.vnic_alloc	= nfp_app_nic_vnic_alloc,
-	.vnic_free	= nfp_bpf_vnic_free,
 
 	.setup_tc	= nfp_bpf_setup_tc,
 	.tc_busy	= nfp_bpf_tc_busy,
diff --git a/drivers/net/ethernet/netronome/nfp/bpf/main.h b/drivers/net/ethernet/netronome/nfp/bpf/main.h
index 082a15f6dfb5..5884291ddba5 100644
--- a/drivers/net/ethernet/netronome/nfp/bpf/main.h
+++ b/drivers/net/ethernet/netronome/nfp/bpf/main.h
@@ -1,5 +1,5 @@
 /*
- * Copyright (C) 2016 Netronome Systems, Inc.
+ * Copyright (C) 2016-2017 Netronome Systems, Inc.
  *
  * This software is dual licensed under the GNU General License Version 2,
  * June 1991 as shown in the file COPYING in the top-level directory of this
@@ -89,23 +89,37 @@ typedef int (*instr_cb_t)(struct nfp_prog *, struct nfp_insn_meta *);
 #define nfp_meta_next(meta)	list_next_entry(meta, l)
 #define nfp_meta_prev(meta)	list_prev_entry(meta, l)
 
+#define FLAG_INSN_IS_JUMP_DST	BIT(0)
+
 /**
  * struct nfp_insn_meta - BPF instruction wrapper
  * @insn: BPF instruction
  * @ptr: pointer type for memory operations
+ * @ldst_gather_len: memcpy length gathered from load/store sequence
+ * @paired_st: the paired store insn at the head of the sequence
  * @ptr_not_const: pointer is not always constant
+ * @jmp_dst: destination info for jump instructions
  * @off: index of first generated machine instruction (in nfp_prog.prog)
  * @n: eBPF instruction number
+ * @flags: eBPF instruction extra optimization flags
  * @skip: skip this instruction (optimized out)
  * @double_cb: callback for second part of the instruction
  * @l: link on nfp_prog->insns list
  */
 struct nfp_insn_meta {
 	struct bpf_insn insn;
-	struct bpf_reg_state ptr;
-	bool ptr_not_const;
+	union {
+		struct {
+			struct bpf_reg_state ptr;
+			struct bpf_insn *paired_st;
+			s16 ldst_gather_len;
+			bool ptr_not_const;
+		};
+		struct nfp_insn_meta *jmp_dst;
+	};
 	unsigned int off;
 	unsigned short n;
+	unsigned short flags;
 	bool skip;
 	instr_cb_t double_cb;
 
@@ -134,6 +148,16 @@ static inline u8 mbpf_mode(const struct nfp_insn_meta *meta)
 	return BPF_MODE(meta->insn.code);
 }
 
+static inline bool is_mbpf_load(const struct nfp_insn_meta *meta)
+{
+	return (meta->insn.code & ~BPF_SIZE_MASK) == (BPF_LDX | BPF_MEM);
+}
+
+static inline bool is_mbpf_store(const struct nfp_insn_meta *meta)
+{
+	return (meta->insn.code & ~BPF_SIZE_MASK) == (BPF_STX | BPF_MEM);
+}
+
 /**
  * struct nfp_prog - nfp BPF program
  * @prog: machine code
@@ -142,6 +166,7 @@ static inline u8 mbpf_mode(const struct nfp_insn_meta *meta)
  * @verifier_meta: temporary storage for verifier's insn meta
  * @type: BPF program type
  * @start_off: address of the first instruction in the memory
+ * @last_bpf_off: address of the last instruction translated from BPF
  * @tgt_out: jump target for normal exit
  * @tgt_abort: jump target for abort (e.g. access outside of packet buffer)
  * @tgt_done: jump target to get the next packet
@@ -160,6 +185,7 @@ struct nfp_prog {
 	enum bpf_prog_type type;
 
 	unsigned int start_off;
+	unsigned int last_bpf_off;
 	unsigned int tgt_out;
 	unsigned int tgt_abort;
 	unsigned int tgt_done;
@@ -189,4 +215,7 @@ int nfp_bpf_translate(struct nfp_app *app, struct nfp_net *nn,
 		      struct bpf_prog *prog);
 int nfp_bpf_destroy(struct nfp_app *app, struct nfp_net *nn,
 		    struct bpf_prog *prog);
+struct nfp_insn_meta *
+nfp_bpf_goto_meta(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta,
+		  unsigned int insn_idx, unsigned int n_insns);
 #endif
diff --git a/drivers/net/ethernet/netronome/nfp/bpf/offload.c b/drivers/net/ethernet/netronome/nfp/bpf/offload.c
index bc879aeb62d4..377976ce92dd 100644
--- a/drivers/net/ethernet/netronome/nfp/bpf/offload.c
+++ b/drivers/net/ethernet/netronome/nfp/bpf/offload.c
@@ -1,5 +1,5 @@
 /*
- * Copyright (C) 2016 Netronome Systems, Inc.
+ * Copyright (C) 2016-2017 Netronome Systems, Inc.
  *
  * This software is dual licensed under the GNU General License Version 2,
  * June 1991 as shown in the file COPYING in the top-level directory of this
@@ -55,11 +55,10 @@ static int
 nfp_prog_prepare(struct nfp_prog *nfp_prog, const struct bpf_insn *prog,
 		 unsigned int cnt)
 {
+	struct nfp_insn_meta *meta;
 	unsigned int i;
 
 	for (i = 0; i < cnt; i++) {
-		struct nfp_insn_meta *meta;
-
 		meta = kzalloc(sizeof(*meta), GFP_KERNEL);
 		if (!meta)
 			return -ENOMEM;
@@ -70,6 +69,24 @@ nfp_prog_prepare(struct nfp_prog *nfp_prog, const struct bpf_insn *prog,
 		list_add_tail(&meta->l, &nfp_prog->insns);
 	}
 
+	/* Another pass to record jump information. */
+	list_for_each_entry(meta, &nfp_prog->insns, l) {
+		u64 code = meta->insn.code;
+
+		if (BPF_CLASS(code) == BPF_JMP && BPF_OP(code) != BPF_EXIT &&
+		    BPF_OP(code) != BPF_CALL) {
+			struct nfp_insn_meta *dst_meta;
+			unsigned short dst_indx;
+
+			dst_indx = meta->n + 1 + meta->insn.off;
+			dst_meta = nfp_bpf_goto_meta(nfp_prog, meta, dst_indx,
+						     cnt);
+
+			meta->jmp_dst = dst_meta;
+			dst_meta->flags |= FLAG_INSN_IS_JUMP_DST;
+		}
+	}
+
 	return 0;
 }
 
diff --git a/drivers/net/ethernet/netronome/nfp/bpf/verifier.c b/drivers/net/ethernet/netronome/nfp/bpf/verifier.c
index 8d43491ddd6b..d2bf29c90226 100644
--- a/drivers/net/ethernet/netronome/nfp/bpf/verifier.c
+++ b/drivers/net/ethernet/netronome/nfp/bpf/verifier.c
@@ -1,5 +1,5 @@
 /*
- * Copyright (C) 2016 Netronome Systems, Inc.
+ * Copyright (C) 2016-2017 Netronome Systems, Inc.
  *
  * This software is dual licensed under the GNU General License Version 2,
  * June 1991 as shown in the file COPYING in the top-level directory of this
@@ -40,7 +40,7 @@
 
 #include "main.h"
 
-static struct nfp_insn_meta *
+struct nfp_insn_meta *
 nfp_bpf_goto_meta(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta,
 		  unsigned int insn_idx, unsigned int n_insns)
 {
@@ -180,10 +180,10 @@ nfp_verify_insn(struct bpf_verifier_env *env, int insn_idx, int prev_insn_idx)
 	if (meta->insn.code == (BPF_JMP | BPF_EXIT))
 		return nfp_bpf_check_exit(nfp_prog, env);
 
-	if ((meta->insn.code & ~BPF_SIZE_MASK) == (BPF_LDX | BPF_MEM))
+	if (is_mbpf_load(meta))
 		return nfp_bpf_check_ptr(nfp_prog, meta, env,
 					 meta->insn.src_reg);
-	if ((meta->insn.code & ~BPF_SIZE_MASK) == (BPF_STX | BPF_MEM))
+	if (is_mbpf_store(meta))
 		return nfp_bpf_check_ptr(nfp_prog, meta, env,
 					 meta->insn.dst_reg);
 
diff --git a/drivers/net/ethernet/netronome/nfp/flower/action.c b/drivers/net/ethernet/netronome/nfp/flower/action.c
index c1c595f8bb87..ca74c517f626 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/action.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/action.c
@@ -93,13 +93,11 @@ nfp_fl_output(struct nfp_fl_output *output, const struct tc_action *action,
 	size_t act_size = sizeof(struct nfp_fl_output);
 	struct net_device *out_dev;
 	u16 tmp_flags;
-	int ifindex;
 
 	output->head.jump_id = NFP_FL_ACTION_OPCODE_OUTPUT;
 	output->head.len_lw = act_size >> NFP_FL_LW_SIZ;
 
-	ifindex = tcf_mirred_ifindex(action);
-	out_dev = __dev_get_by_index(dev_net(in_dev), ifindex);
+	out_dev = tcf_mirred_dev(action);
 	if (!out_dev)
 		return -EOPNOTSUPP;
 
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_asm.c b/drivers/net/ethernet/netronome/nfp/nfp_asm.c
index 830f6de25f47..d3610987fb07 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_asm.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_asm.c
@@ -41,6 +41,7 @@
 
 const struct cmd_tgt_act cmd_tgt_act[__CMD_TGT_MAP_SIZE] = {
 	[CMD_TGT_WRITE8_SWAP] =		{ 0x02, 0x42 },
+	[CMD_TGT_WRITE32_SWAP] =	{ 0x02, 0x5f },
 	[CMD_TGT_READ8] =		{ 0x01, 0x43 },
 	[CMD_TGT_READ32] =		{ 0x00, 0x5c },
 	[CMD_TGT_READ32_LE] =		{ 0x01, 0x5c },
@@ -120,7 +121,8 @@ int swreg_to_unrestricted(swreg dst, swreg lreg, swreg rreg,
 	reg->dst = nfp_swreg_to_unreg(dst, true);
 
 	/* Decode source operands */
-	if (swreg_type(lreg) == swreg_type(rreg))
+	if (swreg_type(lreg) == swreg_type(rreg) &&
+	    swreg_type(lreg) != NN_REG_NONE)
 		return -EFAULT;
 
 	if (swreg_type(lreg) == NN_REG_GPR_B ||
@@ -200,7 +202,8 @@ int swreg_to_restricted(swreg dst, swreg lreg, swreg rreg,
 	reg->dst = nfp_swreg_to_rereg(dst, true, false, NULL);
 
 	/* Decode source operands */
-	if (swreg_type(lreg) == swreg_type(rreg))
+	if (swreg_type(lreg) == swreg_type(rreg) &&
+	    swreg_type(lreg) != NN_REG_NONE)
 		return -EFAULT;
 
 	if (swreg_type(lreg) == NN_REG_GPR_B ||
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_asm.h b/drivers/net/ethernet/netronome/nfp/nfp_asm.h
index 74d0c11ab2f9..3387e6926eb0 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_asm.h
+++ b/drivers/net/ethernet/netronome/nfp/nfp_asm.h
@@ -1,5 +1,5 @@
 /*
- * Copyright (C) 2016 Netronome Systems, Inc.
+ * Copyright (C) 2016-2017 Netronome Systems, Inc.
  *
  * This software is dual licensed under the GNU General License Version 2,
  * June 1991 as shown in the file COPYING in the top-level directory of this
@@ -209,6 +209,7 @@ enum alu_dst_ab {
 #define OP_CMD_CNT		0x0000e000000ULL
 #define OP_CMD_SIG		0x000f0000000ULL
 #define OP_CMD_TGT_CMD		0x07f00000000ULL
+#define OP_CMD_INDIR		0x20000000000ULL
 #define OP_CMD_MODE	       0x1c0000000000ULL
 
 struct cmd_tgt_act {
@@ -219,6 +220,7 @@ struct cmd_tgt_act {
 enum cmd_tgt_map {
 	CMD_TGT_READ8,
 	CMD_TGT_WRITE8_SWAP,
+	CMD_TGT_WRITE32_SWAP,
 	CMD_TGT_READ32,
 	CMD_TGT_READ32_LE,
 	CMD_TGT_READ32_SWAP,
@@ -240,6 +242,9 @@ enum cmd_ctx_swap {
 	CMD_CTX_NO_SWAP = 3,
 };
 
+#define CMD_OVE_LEN	BIT(7)
+#define CMD_OV_LEN	GENMASK(12, 8)
+
 #define OP_LCSR_BASE		0x0fc00000000ULL
 #define OP_LCSR_A_SRC		0x000000003ffULL
 #define OP_LCSR_B_SRC		0x000000ffc00ULL
@@ -257,6 +262,7 @@ enum lcsr_wr_src {
 #define OP_CARB_BASE		0x0e000000000ULL
 #define OP_CARB_OR		0x00000010000ULL
 
+#define NFP_CSR_CTX_PTR		0x20
 #define NFP_CSR_ACT_LM_ADDR0	0x64
 #define NFP_CSR_ACT_LM_ADDR1	0x6c
 #define NFP_CSR_ACT_LM_ADDR2	0x94
@@ -377,4 +383,13 @@ int swreg_to_restricted(swreg dst, swreg lreg, swreg rreg,
 int nfp_ustore_check_valid_no_ecc(u64 insn);
 u64 nfp_ustore_calc_ecc_insn(u64 insn);
 
+#define NFP_IND_ME_REFL_WR_SIG_INIT	3
+#define NFP_IND_ME_CTX_PTR_BASE_MASK	GENMASK(9, 0)
+#define NFP_IND_NUM_CONTEXTS		8
+
+static inline u32 nfp_get_ind_csr_ctx_ptr_offs(u32 read_offset)
+{
+	return (read_offset & ~NFP_IND_ME_CTX_PTR_BASE_MASK) | NFP_CSR_CTX_PTR;
+}
+
 #endif
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_main.c b/drivers/net/ethernet/netronome/nfp/nfp_main.c
index 35eaccbece36..0953fa8f3109 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_main.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_main.c
@@ -45,6 +45,7 @@
 #include <linux/pci.h>
 #include <linux/firmware.h>
 #include <linux/vermagic.h>
+#include <linux/vmalloc.h>
 #include <net/devlink.h>
 
 #include "nfpcore/nfp.h"
@@ -509,6 +510,9 @@ static int nfp_pci_probe(struct pci_dev *pdev,
 	pf->mip = nfp_mip_open(pf->cpp);
 	pf->rtbl = __nfp_rtsym_table_read(pf->cpp, pf->mip);
 
+	pf->dump_flag = NFP_DUMP_NSP_DIAG;
+	pf->dumpspec = nfp_net_dump_load_dumpspec(pf->cpp, pf->rtbl);
+
 	err = nfp_pcie_sriov_read_nfd_limit(pf);
 	if (err)
 		goto err_fw_unload;
@@ -544,6 +548,7 @@ err_fw_unload:
 		nfp_fw_unload(pf);
 	kfree(pf->eth_tbl);
 	kfree(pf->nspi);
+	vfree(pf->dumpspec);
 err_devlink_unreg:
 	devlink_unregister(devlink);
 err_hwinfo_free:
@@ -579,6 +584,7 @@ static void nfp_pci_remove(struct pci_dev *pdev)
 
 	devlink_unregister(devlink);
 
+	vfree(pf->dumpspec);
 	kfree(pf->rtbl);
 	nfp_mip_close(pf->mip);
 	if (pf->fw_loaded)
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_main.h b/drivers/net/ethernet/netronome/nfp/nfp_main.h
index be0ee59f2eb9..add46e28212b 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_main.h
+++ b/drivers/net/ethernet/netronome/nfp/nfp_main.h
@@ -39,6 +39,7 @@
 #ifndef NFP_MAIN_H
 #define NFP_MAIN_H
 
+#include <linux/ethtool.h>
 #include <linux/list.h>
 #include <linux/types.h>
 #include <linux/msi.h>
@@ -62,6 +63,17 @@ struct nfp_port;
 struct nfp_rtsym_table;
 
 /**
+ * struct nfp_dumpspec - NFP FW dump specification structure
+ * @size:	Size of the data
+ * @data:	Sequence of TLVs, each being an instruction to dump some data
+ *		from FW
+ */
+struct nfp_dumpspec {
+	u32 size;
+	u8 data[0];
+};
+
+/**
  * struct nfp_pf - NFP PF-specific device structure
  * @pdev:		Backpointer to PCI device
  * @cpp:		Pointer to the CPP handle
@@ -83,6 +95,9 @@ struct nfp_rtsym_table;
  * @mip:		MIP handle
  * @rtbl:		RTsym table
  * @hwinfo:		HWInfo table
+ * @dumpspec:		Debug dump specification
+ * @dump_flag:		Store dump flag between set_dump and get_dump_flag
+ * @dump_len:		Store dump length between set_dump and get_dump_flag
  * @eth_tbl:		NSP ETH table
  * @nspi:		NSP identification info
  * @hwmon_dev:		pointer to hwmon device
@@ -124,6 +139,9 @@ struct nfp_pf {
 	const struct nfp_mip *mip;
 	struct nfp_rtsym_table *rtbl;
 	struct nfp_hwinfo *hwinfo;
+	struct nfp_dumpspec *dumpspec;
+	u32 dump_flag;
+	u32 dump_len;
 	struct nfp_eth_table *eth_tbl;
 	struct nfp_nsp_identify *nspi;
 
@@ -157,4 +175,15 @@ void nfp_net_get_mac_addr(struct nfp_pf *pf, struct nfp_port *port);
 
 bool nfp_ctrl_tx(struct nfp_net *nn, struct sk_buff *skb);
 
+enum nfp_dump_diag {
+	NFP_DUMP_NSP_DIAG = 0,
+};
+
+struct nfp_dumpspec *
+nfp_net_dump_load_dumpspec(struct nfp_cpp *cpp, struct nfp_rtsym_table *rtbl);
+s64 nfp_net_dump_calculate_size(struct nfp_pf *pf, struct nfp_dumpspec *spec,
+				u32 flag);
+int nfp_net_dump_populate_buffer(struct nfp_pf *pf, struct nfp_dumpspec *spec,
+				 struct ethtool_dump *dump_param, void *dest);
+
 #endif /* NFP_MAIN_H */
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net.h b/drivers/net/ethernet/netronome/nfp/nfp_net.h
index 7f9857c276b1..3801c52098d5 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net.h
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net.h
@@ -548,6 +548,8 @@ struct nfp_net_dp {
  * @max_r_vecs:		Number of allocated interrupt vectors for RX/TX
  * @max_tx_rings:       Maximum number of TX rings supported by the Firmware
  * @max_rx_rings:       Maximum number of RX rings supported by the Firmware
+ * @stride_rx:		Queue controller RX queue spacing
+ * @stride_tx:		Queue controller TX queue spacing
  * @r_vecs:             Pre-allocated array of ring vectors
  * @irq_entries:        Pre-allocated array of MSI-X entries
  * @lsc_handler:        Handler for Link State Change interrupt
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
index 1a603fdd9e80..ad3e9f6a61e5 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
@@ -3392,6 +3392,7 @@ static int nfp_net_xdp(struct net_device *netdev, struct netdev_bpf *xdp)
 		if (nn->dp.bpf_offload_xdp)
 			xdp->prog_attached = XDP_ATTACHED_HW;
 		xdp->prog_id = nn->xdp_prog ? nn->xdp_prog->aux->id : 0;
+		xdp->flags = nn->xdp_prog ? nn->xdp_flags : 0;
 		return 0;
 	case BPF_OFFLOAD_VERIFIER_PREP:
 		return nfp_app_bpf_verifier_prep(nn->app, nn, xdp);
@@ -3561,9 +3562,6 @@ struct nfp_net *nfp_net_alloc(struct pci_dev *pdev, bool needs_netdev,
  */
 void nfp_net_free(struct nfp_net *nn)
 {
-	if (nn->xdp_prog)
-		bpf_prog_put(nn->xdp_prog);
-
 	if (nn->dp.netdev)
 		free_netdev(nn->dp.netdev);
 	else
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_debugdump.c b/drivers/net/ethernet/netronome/nfp/nfp_net_debugdump.c
new file mode 100644
index 000000000000..cb74602f0907
--- /dev/null
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_debugdump.c
@@ -0,0 +1,787 @@
+/*
+ * Copyright (C) 2017 Netronome Systems, Inc.
+ *
+ * This software is dual licensed under the GNU General License Version 2,
+ * June 1991 as shown in the file COPYING in the top-level directory of this
+ * source tree or the BSD 2-Clause License provided below.  You have the
+ * option to license this software under the complete terms of either license.
+ *
+ * The BSD 2-Clause License:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      1. Redistributions of source code must retain the above
+ *         copyright notice, this list of conditions and the following
+ *         disclaimer.
+ *
+ *      2. Redistributions in binary form must reproduce the above
+ *         copyright notice, this list of conditions and the following
+ *         disclaimer in the documentation and/or other materials
+ *         provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include <linux/ethtool.h>
+#include <linux/vmalloc.h>
+
+#include "nfp_asm.h"
+#include "nfp_main.h"
+#include "nfpcore/nfp.h"
+#include "nfpcore/nfp_nffw.h"
+
+#define NFP_DUMP_SPEC_RTSYM	"_abi_dump_spec"
+
+#define ALIGN8(x)	ALIGN(x, 8)
+
+enum nfp_dumpspec_type {
+	NFP_DUMPSPEC_TYPE_CPP_CSR = 0,
+	NFP_DUMPSPEC_TYPE_XPB_CSR = 1,
+	NFP_DUMPSPEC_TYPE_ME_CSR = 2,
+	NFP_DUMPSPEC_TYPE_INDIRECT_ME_CSR = 3,
+	NFP_DUMPSPEC_TYPE_RTSYM = 4,
+	NFP_DUMPSPEC_TYPE_HWINFO = 5,
+	NFP_DUMPSPEC_TYPE_FWNAME = 6,
+	NFP_DUMPSPEC_TYPE_HWINFO_FIELD = 7,
+	NFP_DUMPSPEC_TYPE_PROLOG = 10000,
+	NFP_DUMPSPEC_TYPE_ERROR = 10001,
+};
+
+/* The following structs must be carefully aligned so that they can be used to
+ * interpret the binary dumpspec and populate the dump data in a deterministic
+ * way.
+ */
+
+/* generic type plus length */
+struct nfp_dump_tl {
+	__be32 type;
+	__be32 length;	/* chunk length to follow, aligned to 8 bytes */
+	char data[0];
+};
+
+/* NFP CPP parameters */
+struct nfp_dumpspec_cpp_isl_id {
+	u8 target;
+	u8 action;
+	u8 token;
+	u8 island;
+};
+
+struct nfp_dump_common_cpp {
+	struct nfp_dumpspec_cpp_isl_id cpp_id;
+	__be32 offset;		/* address to start dump */
+	__be32 dump_length;	/* total bytes to dump, aligned to reg size */
+};
+
+/* CSR dumpables */
+struct nfp_dumpspec_csr {
+	struct nfp_dump_tl tl;
+	struct nfp_dump_common_cpp cpp;
+	__be32 register_width;	/* in bits */
+};
+
+struct nfp_dumpspec_rtsym {
+	struct nfp_dump_tl tl;
+	char rtsym[0];
+};
+
+/* header for register dumpable */
+struct nfp_dump_csr {
+	struct nfp_dump_tl tl;
+	struct nfp_dump_common_cpp cpp;
+	__be32 register_width;	/* in bits */
+	__be32 error;		/* error code encountered while reading */
+	__be32 error_offset;	/* offset being read when error occurred */
+};
+
+struct nfp_dump_rtsym {
+	struct nfp_dump_tl tl;
+	struct nfp_dump_common_cpp cpp;
+	__be32 error;		/* error code encountered while reading */
+	u8 padded_name_length;	/* pad so data starts at 8 byte boundary */
+	char rtsym[0];
+	/* after padded_name_length, there is dump_length data */
+};
+
+struct nfp_dump_prolog {
+	struct nfp_dump_tl tl;
+	__be32 dump_level;
+};
+
+struct nfp_dump_error {
+	struct nfp_dump_tl tl;
+	__be32 error;
+	char padding[4];
+	char spec[0];
+};
+
+/* to track state through debug size calculation TLV traversal */
+struct nfp_level_size {
+	u32 requested_level;	/* input */
+	u32 total_size;		/* output */
+};
+
+/* to track state during debug dump creation TLV traversal */
+struct nfp_dump_state {
+	u32 requested_level;	/* input param */
+	u32 dumped_size;	/* adds up to size of dumped data */
+	u32 buf_size;		/* size of buffer pointer to by p */
+	void *p;		/* current point in dump buffer */
+};
+
+typedef int (*nfp_tlv_visit)(struct nfp_pf *pf, struct nfp_dump_tl *tl,
+			     void *param);
+
+static int
+nfp_traverse_tlvs(struct nfp_pf *pf, void *data, u32 data_length, void *param,
+		  nfp_tlv_visit tlv_visit)
+{
+	long long remaining = data_length;
+	struct nfp_dump_tl *tl;
+	u32 total_tlv_size;
+	void *p = data;
+	int err;
+
+	while (remaining >= sizeof(*tl)) {
+		tl = p;
+		if (!tl->type && !tl->length)
+			break;
+
+		if (be32_to_cpu(tl->length) > remaining - sizeof(*tl))
+			return -EINVAL;
+
+		total_tlv_size = sizeof(*tl) + be32_to_cpu(tl->length);
+
+		/* Spec TLVs should be aligned to 4 bytes. */
+		if (total_tlv_size % 4 != 0)
+			return -EINVAL;
+
+		p += total_tlv_size;
+		remaining -= total_tlv_size;
+		err = tlv_visit(pf, tl, param);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
+static u32 nfp_get_numeric_cpp_id(struct nfp_dumpspec_cpp_isl_id *cpp_id)
+{
+	return NFP_CPP_ISLAND_ID(cpp_id->target, cpp_id->action, cpp_id->token,
+				 cpp_id->island);
+}
+
+struct nfp_dumpspec *
+nfp_net_dump_load_dumpspec(struct nfp_cpp *cpp, struct nfp_rtsym_table *rtbl)
+{
+	const struct nfp_rtsym *specsym;
+	struct nfp_dumpspec *dumpspec;
+	int bytes_read;
+	u32 cpp_id;
+
+	specsym = nfp_rtsym_lookup(rtbl, NFP_DUMP_SPEC_RTSYM);
+	if (!specsym)
+		return NULL;
+
+	/* expected size of this buffer is in the order of tens of kilobytes */
+	dumpspec = vmalloc(sizeof(*dumpspec) + specsym->size);
+	if (!dumpspec)
+		return NULL;
+
+	dumpspec->size = specsym->size;
+
+	cpp_id = NFP_CPP_ISLAND_ID(specsym->target, NFP_CPP_ACTION_RW, 0,
+				   specsym->domain);
+
+	bytes_read = nfp_cpp_read(cpp, cpp_id, specsym->addr, dumpspec->data,
+				  specsym->size);
+	if (bytes_read != specsym->size) {
+		vfree(dumpspec);
+		nfp_warn(cpp, "Debug dump specification read failed.\n");
+		return NULL;
+	}
+
+	return dumpspec;
+}
+
+static int nfp_dump_error_tlv_size(struct nfp_dump_tl *spec)
+{
+	return ALIGN8(sizeof(struct nfp_dump_error) + sizeof(*spec) +
+		      be32_to_cpu(spec->length));
+}
+
+static int nfp_calc_fwname_tlv_size(struct nfp_pf *pf)
+{
+	u32 fwname_len = strlen(nfp_mip_name(pf->mip));
+
+	return sizeof(struct nfp_dump_tl) + ALIGN8(fwname_len + 1);
+}
+
+static int nfp_calc_hwinfo_field_sz(struct nfp_pf *pf, struct nfp_dump_tl *spec)
+{
+	u32 tl_len, key_len;
+	const char *value;
+
+	tl_len = be32_to_cpu(spec->length);
+	key_len = strnlen(spec->data, tl_len);
+	if (key_len == tl_len)
+		return nfp_dump_error_tlv_size(spec);
+
+	value = nfp_hwinfo_lookup(pf->hwinfo, spec->data);
+	if (!value)
+		return nfp_dump_error_tlv_size(spec);
+
+	return sizeof(struct nfp_dump_tl) + ALIGN8(key_len + strlen(value) + 2);
+}
+
+static bool nfp_csr_spec_valid(struct nfp_dumpspec_csr *spec_csr)
+{
+	u32 required_read_sz = sizeof(*spec_csr) - sizeof(spec_csr->tl);
+	u32 available_sz = be32_to_cpu(spec_csr->tl.length);
+	u32 reg_width;
+
+	if (available_sz < required_read_sz)
+		return false;
+
+	reg_width = be32_to_cpu(spec_csr->register_width);
+
+	return reg_width == 32 || reg_width == 64;
+}
+
+static int
+nfp_calc_rtsym_dump_sz(struct nfp_pf *pf, struct nfp_dump_tl *spec)
+{
+	struct nfp_rtsym_table *rtbl = pf->rtbl;
+	struct nfp_dumpspec_rtsym *spec_rtsym;
+	const struct nfp_rtsym *sym;
+	u32 tl_len, key_len;
+
+	spec_rtsym = (struct nfp_dumpspec_rtsym *)spec;
+	tl_len = be32_to_cpu(spec->length);
+	key_len = strnlen(spec_rtsym->rtsym, tl_len);
+	if (key_len == tl_len)
+		return nfp_dump_error_tlv_size(spec);
+
+	sym = nfp_rtsym_lookup(rtbl, spec_rtsym->rtsym);
+	if (!sym)
+		return nfp_dump_error_tlv_size(spec);
+
+	return ALIGN8(offsetof(struct nfp_dump_rtsym, rtsym) + key_len + 1) +
+	       ALIGN8(sym->size);
+}
+
+static int
+nfp_add_tlv_size(struct nfp_pf *pf, struct nfp_dump_tl *tl, void *param)
+{
+	struct nfp_dumpspec_csr *spec_csr;
+	u32 *size = param;
+	u32 hwinfo_size;
+
+	switch (be32_to_cpu(tl->type)) {
+	case NFP_DUMPSPEC_TYPE_FWNAME:
+		*size += nfp_calc_fwname_tlv_size(pf);
+		break;
+	case NFP_DUMPSPEC_TYPE_CPP_CSR:
+	case NFP_DUMPSPEC_TYPE_XPB_CSR:
+	case NFP_DUMPSPEC_TYPE_ME_CSR:
+		spec_csr = (struct nfp_dumpspec_csr *)tl;
+		if (!nfp_csr_spec_valid(spec_csr))
+			*size += nfp_dump_error_tlv_size(tl);
+		else
+			*size += ALIGN8(sizeof(struct nfp_dump_csr)) +
+				 ALIGN8(be32_to_cpu(spec_csr->cpp.dump_length));
+		break;
+	case NFP_DUMPSPEC_TYPE_INDIRECT_ME_CSR:
+		spec_csr = (struct nfp_dumpspec_csr *)tl;
+		if (!nfp_csr_spec_valid(spec_csr))
+			*size += nfp_dump_error_tlv_size(tl);
+		else
+			*size += ALIGN8(sizeof(struct nfp_dump_csr)) +
+				 ALIGN8(be32_to_cpu(spec_csr->cpp.dump_length) *
+					NFP_IND_NUM_CONTEXTS);
+		break;
+	case NFP_DUMPSPEC_TYPE_RTSYM:
+		*size += nfp_calc_rtsym_dump_sz(pf, tl);
+		break;
+	case NFP_DUMPSPEC_TYPE_HWINFO:
+		hwinfo_size = nfp_hwinfo_get_packed_str_size(pf->hwinfo);
+		*size += sizeof(struct nfp_dump_tl) + ALIGN8(hwinfo_size);
+		break;
+	case NFP_DUMPSPEC_TYPE_HWINFO_FIELD:
+		*size += nfp_calc_hwinfo_field_sz(pf, tl);
+		break;
+	default:
+		*size += nfp_dump_error_tlv_size(tl);
+		break;
+	}
+
+	return 0;
+}
+
+static int
+nfp_calc_specific_level_size(struct nfp_pf *pf, struct nfp_dump_tl *dump_level,
+			     void *param)
+{
+	struct nfp_level_size *lev_sz = param;
+
+	if (be32_to_cpu(dump_level->type) != lev_sz->requested_level)
+		return 0;
+
+	return nfp_traverse_tlvs(pf, dump_level->data,
+				 be32_to_cpu(dump_level->length),
+				 &lev_sz->total_size, nfp_add_tlv_size);
+}
+
+s64 nfp_net_dump_calculate_size(struct nfp_pf *pf, struct nfp_dumpspec *spec,
+				u32 flag)
+{
+	struct nfp_level_size lev_sz;
+	int err;
+
+	lev_sz.requested_level = flag;
+	lev_sz.total_size = ALIGN8(sizeof(struct nfp_dump_prolog));
+
+	err = nfp_traverse_tlvs(pf, spec->data, spec->size, &lev_sz,
+				nfp_calc_specific_level_size);
+	if (err)
+		return err;
+
+	return lev_sz.total_size;
+}
+
+static int nfp_add_tlv(u32 type, u32 total_tlv_sz, struct nfp_dump_state *dump)
+{
+	struct nfp_dump_tl *tl = dump->p;
+
+	if (total_tlv_sz > dump->buf_size)
+		return -ENOSPC;
+
+	if (dump->buf_size - total_tlv_sz < dump->dumped_size)
+		return -ENOSPC;
+
+	tl->type = cpu_to_be32(type);
+	tl->length = cpu_to_be32(total_tlv_sz - sizeof(*tl));
+
+	dump->dumped_size += total_tlv_sz;
+	dump->p += total_tlv_sz;
+
+	return 0;
+}
+
+static int
+nfp_dump_error_tlv(struct nfp_dump_tl *spec, int error,
+		   struct nfp_dump_state *dump)
+{
+	struct nfp_dump_error *dump_header = dump->p;
+	u32 total_spec_size, total_size;
+	int err;
+
+	total_spec_size = sizeof(*spec) + be32_to_cpu(spec->length);
+	total_size = ALIGN8(sizeof(*dump_header) + total_spec_size);
+
+	err = nfp_add_tlv(NFP_DUMPSPEC_TYPE_ERROR, total_size, dump);
+	if (err)
+		return err;
+
+	dump_header->error = cpu_to_be32(error);
+	memcpy(dump_header->spec, spec, total_spec_size);
+
+	return 0;
+}
+
+static int nfp_dump_fwname(struct nfp_pf *pf, struct nfp_dump_state *dump)
+{
+	struct nfp_dump_tl *dump_header = dump->p;
+	u32 fwname_len, total_size;
+	const char *fwname;
+	int err;
+
+	fwname = nfp_mip_name(pf->mip);
+	fwname_len = strlen(fwname);
+	total_size = sizeof(*dump_header) + ALIGN8(fwname_len + 1);
+
+	err = nfp_add_tlv(NFP_DUMPSPEC_TYPE_FWNAME, total_size, dump);
+	if (err)
+		return err;
+
+	memcpy(dump_header->data, fwname, fwname_len);
+
+	return 0;
+}
+
+static int
+nfp_dump_hwinfo(struct nfp_pf *pf, struct nfp_dump_tl *spec,
+		struct nfp_dump_state *dump)
+{
+	struct nfp_dump_tl *dump_header = dump->p;
+	u32 hwinfo_size, total_size;
+	char *hwinfo;
+	int err;
+
+	hwinfo = nfp_hwinfo_get_packed_strings(pf->hwinfo);
+	hwinfo_size = nfp_hwinfo_get_packed_str_size(pf->hwinfo);
+	total_size = sizeof(*dump_header) + ALIGN8(hwinfo_size);
+
+	err = nfp_add_tlv(NFP_DUMPSPEC_TYPE_HWINFO, total_size, dump);
+	if (err)
+		return err;
+
+	memcpy(dump_header->data, hwinfo, hwinfo_size);
+
+	return 0;
+}
+
+static int nfp_dump_hwinfo_field(struct nfp_pf *pf, struct nfp_dump_tl *spec,
+				 struct nfp_dump_state *dump)
+{
+	struct nfp_dump_tl *dump_header = dump->p;
+	u32 tl_len, key_len, val_len;
+	const char *key, *value;
+	u32 total_size;
+	int err;
+
+	tl_len = be32_to_cpu(spec->length);
+	key_len = strnlen(spec->data, tl_len);
+	if (key_len == tl_len)
+		return nfp_dump_error_tlv(spec, -EINVAL, dump);
+
+	key = spec->data;
+	value = nfp_hwinfo_lookup(pf->hwinfo, key);
+	if (!value)
+		return nfp_dump_error_tlv(spec, -ENOENT, dump);
+
+	val_len = strlen(value);
+	total_size = sizeof(*dump_header) + ALIGN8(key_len + val_len + 2);
+	err = nfp_add_tlv(NFP_DUMPSPEC_TYPE_HWINFO_FIELD, total_size, dump);
+	if (err)
+		return err;
+
+	memcpy(dump_header->data, key, key_len + 1);
+	memcpy(dump_header->data + key_len + 1, value, val_len + 1);
+
+	return 0;
+}
+
+static int
+nfp_dump_csr_range(struct nfp_pf *pf, struct nfp_dumpspec_csr *spec_csr,
+		   struct nfp_dump_state *dump)
+{
+	struct nfp_dump_csr *dump_header = dump->p;
+	u32 reg_sz, header_size, total_size;
+	u32 cpp_rd_addr, max_rd_addr;
+	int bytes_read;
+	void *dest;
+	u32 cpp_id;
+	int err;
+
+	if (!nfp_csr_spec_valid(spec_csr))
+		return nfp_dump_error_tlv(&spec_csr->tl, -EINVAL, dump);
+
+	reg_sz = be32_to_cpu(spec_csr->register_width) / BITS_PER_BYTE;
+	header_size = ALIGN8(sizeof(*dump_header));
+	total_size = header_size +
+		     ALIGN8(be32_to_cpu(spec_csr->cpp.dump_length));
+	dest = dump->p + header_size;
+
+	err = nfp_add_tlv(be32_to_cpu(spec_csr->tl.type), total_size, dump);
+	if (err)
+		return err;
+
+	dump_header->cpp = spec_csr->cpp;
+	dump_header->register_width = spec_csr->register_width;
+
+	cpp_id = nfp_get_numeric_cpp_id(&spec_csr->cpp.cpp_id);
+	cpp_rd_addr = be32_to_cpu(spec_csr->cpp.offset);
+	max_rd_addr = cpp_rd_addr + be32_to_cpu(spec_csr->cpp.dump_length);
+
+	while (cpp_rd_addr < max_rd_addr) {
+		bytes_read = nfp_cpp_read(pf->cpp, cpp_id, cpp_rd_addr, dest,
+					  reg_sz);
+		if (bytes_read != reg_sz) {
+			if (bytes_read >= 0)
+				bytes_read = -EIO;
+			dump_header->error = cpu_to_be32(bytes_read);
+			dump_header->error_offset = cpu_to_be32(cpp_rd_addr);
+			break;
+		}
+		cpp_rd_addr += reg_sz;
+		dest += reg_sz;
+	}
+
+	return 0;
+}
+
+/* Write context to CSRCtxPtr, then read from it. Then the value can be read
+ * from IndCtxStatus.
+ */
+static int
+nfp_read_indirect_csr(struct nfp_cpp *cpp,
+		      struct nfp_dumpspec_cpp_isl_id cpp_params, u32 offset,
+		      u32 reg_sz, u32 context, void *dest)
+{
+	u32 csr_ctx_ptr_offs;
+	u32 cpp_id;
+	int result;
+
+	csr_ctx_ptr_offs = nfp_get_ind_csr_ctx_ptr_offs(offset);
+	cpp_id = NFP_CPP_ISLAND_ID(cpp_params.target,
+				   NFP_IND_ME_REFL_WR_SIG_INIT,
+				   cpp_params.token, cpp_params.island);
+	result = nfp_cpp_writel(cpp, cpp_id, csr_ctx_ptr_offs, context);
+	if (result != sizeof(context))
+		return result < 0 ? result : -EIO;
+
+	cpp_id = nfp_get_numeric_cpp_id(&cpp_params);
+	result = nfp_cpp_read(cpp, cpp_id, csr_ctx_ptr_offs, dest, reg_sz);
+	if (result != reg_sz)
+		return result < 0 ? result : -EIO;
+
+	result = nfp_cpp_read(cpp, cpp_id, offset, dest, reg_sz);
+	if (result != reg_sz)
+		return result < 0 ? result : -EIO;
+
+	return 0;
+}
+
+static int
+nfp_read_all_indirect_csr_ctx(struct nfp_cpp *cpp,
+			      struct nfp_dumpspec_csr *spec_csr, u32 address,
+			      u32 reg_sz, void *dest)
+{
+	u32 ctx;
+	int err;
+
+	for (ctx = 0; ctx < NFP_IND_NUM_CONTEXTS; ctx++) {
+		err = nfp_read_indirect_csr(cpp, spec_csr->cpp.cpp_id, address,
+					    reg_sz, ctx, dest + ctx * reg_sz);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
+static int
+nfp_dump_indirect_csr_range(struct nfp_pf *pf,
+			    struct nfp_dumpspec_csr *spec_csr,
+			    struct nfp_dump_state *dump)
+{
+	struct nfp_dump_csr *dump_header = dump->p;
+	u32 reg_sz, header_size, total_size;
+	u32 cpp_rd_addr, max_rd_addr;
+	u32 reg_data_length;
+	void *dest;
+	int err;
+
+	if (!nfp_csr_spec_valid(spec_csr))
+		return nfp_dump_error_tlv(&spec_csr->tl, -EINVAL, dump);
+
+	reg_sz = be32_to_cpu(spec_csr->register_width) / BITS_PER_BYTE;
+	header_size = ALIGN8(sizeof(*dump_header));
+	reg_data_length = be32_to_cpu(spec_csr->cpp.dump_length) *
+			  NFP_IND_NUM_CONTEXTS;
+	total_size = header_size + ALIGN8(reg_data_length);
+	dest = dump->p + header_size;
+
+	err = nfp_add_tlv(be32_to_cpu(spec_csr->tl.type), total_size, dump);
+	if (err)
+		return err;
+
+	dump_header->cpp = spec_csr->cpp;
+	dump_header->register_width = spec_csr->register_width;
+
+	cpp_rd_addr = be32_to_cpu(spec_csr->cpp.offset);
+	max_rd_addr = cpp_rd_addr + be32_to_cpu(spec_csr->cpp.dump_length);
+	while (cpp_rd_addr < max_rd_addr) {
+		err = nfp_read_all_indirect_csr_ctx(pf->cpp, spec_csr,
+						    cpp_rd_addr, reg_sz, dest);
+		if (err) {
+			dump_header->error = cpu_to_be32(err);
+			dump_header->error_offset = cpu_to_be32(cpp_rd_addr);
+			break;
+		}
+		cpp_rd_addr += reg_sz;
+		dest += reg_sz * NFP_IND_NUM_CONTEXTS;
+	}
+
+	return 0;
+}
+
+static int
+nfp_dump_single_rtsym(struct nfp_pf *pf, struct nfp_dumpspec_rtsym *spec,
+		      struct nfp_dump_state *dump)
+{
+	struct nfp_dump_rtsym *dump_header = dump->p;
+	struct nfp_dumpspec_cpp_isl_id cpp_params;
+	struct nfp_rtsym_table *rtbl = pf->rtbl;
+	const struct nfp_rtsym *sym;
+	u32 header_size, total_size;
+	u32 tl_len, key_len;
+	int bytes_read;
+	u32 cpp_id;
+	void *dest;
+	int err;
+
+	tl_len = be32_to_cpu(spec->tl.length);
+	key_len = strnlen(spec->rtsym, tl_len);
+	if (key_len == tl_len)
+		return nfp_dump_error_tlv(&spec->tl, -EINVAL, dump);
+
+	sym = nfp_rtsym_lookup(rtbl, spec->rtsym);
+	if (!sym)
+		return nfp_dump_error_tlv(&spec->tl, -ENOENT, dump);
+
+	header_size =
+		ALIGN8(offsetof(struct nfp_dump_rtsym, rtsym) + key_len + 1);
+	total_size = header_size + ALIGN8(sym->size);
+	dest = dump->p + header_size;
+
+	err = nfp_add_tlv(be32_to_cpu(spec->tl.type), total_size, dump);
+	if (err)
+		return err;
+
+	dump_header->padded_name_length =
+		header_size - offsetof(struct nfp_dump_rtsym, rtsym);
+	memcpy(dump_header->rtsym, spec->rtsym, key_len + 1);
+
+	cpp_params.target = sym->target;
+	cpp_params.action = NFP_CPP_ACTION_RW;
+	cpp_params.token  = 0;
+	cpp_params.island = sym->domain;
+	cpp_id = nfp_get_numeric_cpp_id(&cpp_params);
+
+	dump_header->cpp.cpp_id = cpp_params;
+	dump_header->cpp.offset = cpu_to_be32(sym->addr);
+	dump_header->cpp.dump_length = cpu_to_be32(sym->size);
+
+	bytes_read = nfp_cpp_read(pf->cpp, cpp_id, sym->addr, dest, sym->size);
+	if (bytes_read != sym->size) {
+		if (bytes_read >= 0)
+			bytes_read = -EIO;
+		dump_header->error = cpu_to_be32(bytes_read);
+	}
+
+	return 0;
+}
+
+static int
+nfp_dump_for_tlv(struct nfp_pf *pf, struct nfp_dump_tl *tl, void *param)
+{
+	struct nfp_dumpspec_rtsym *spec_rtsym;
+	struct nfp_dump_state *dump = param;
+	struct nfp_dumpspec_csr *spec_csr;
+	int err;
+
+	switch (be32_to_cpu(tl->type)) {
+	case NFP_DUMPSPEC_TYPE_FWNAME:
+		err = nfp_dump_fwname(pf, dump);
+		if (err)
+			return err;
+		break;
+	case NFP_DUMPSPEC_TYPE_CPP_CSR:
+	case NFP_DUMPSPEC_TYPE_XPB_CSR:
+	case NFP_DUMPSPEC_TYPE_ME_CSR:
+		spec_csr = (struct nfp_dumpspec_csr *)tl;
+		err = nfp_dump_csr_range(pf, spec_csr, dump);
+		if (err)
+			return err;
+		break;
+	case NFP_DUMPSPEC_TYPE_INDIRECT_ME_CSR:
+		spec_csr = (struct nfp_dumpspec_csr *)tl;
+		err = nfp_dump_indirect_csr_range(pf, spec_csr, dump);
+		if (err)
+			return err;
+		break;
+	case NFP_DUMPSPEC_TYPE_RTSYM:
+		spec_rtsym = (struct nfp_dumpspec_rtsym *)tl;
+		err = nfp_dump_single_rtsym(pf, spec_rtsym, dump);
+		if (err)
+			return err;
+		break;
+	case NFP_DUMPSPEC_TYPE_HWINFO:
+		err = nfp_dump_hwinfo(pf, tl, dump);
+		if (err)
+			return err;
+		break;
+	case NFP_DUMPSPEC_TYPE_HWINFO_FIELD:
+		err = nfp_dump_hwinfo_field(pf, tl, dump);
+		if (err)
+			return err;
+		break;
+	default:
+		err = nfp_dump_error_tlv(tl, -EOPNOTSUPP, dump);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
+static int
+nfp_dump_specific_level(struct nfp_pf *pf, struct nfp_dump_tl *dump_level,
+			void *param)
+{
+	struct nfp_dump_state *dump = param;
+
+	if (be32_to_cpu(dump_level->type) != dump->requested_level)
+		return 0;
+
+	return nfp_traverse_tlvs(pf, dump_level->data,
+				 be32_to_cpu(dump_level->length), dump,
+				 nfp_dump_for_tlv);
+}
+
+static int nfp_dump_populate_prolog(struct nfp_dump_state *dump)
+{
+	struct nfp_dump_prolog *prolog = dump->p;
+	u32 total_size;
+	int err;
+
+	total_size = ALIGN8(sizeof(*prolog));
+
+	err = nfp_add_tlv(NFP_DUMPSPEC_TYPE_PROLOG, total_size, dump);
+	if (err)
+		return err;
+
+	prolog->dump_level = cpu_to_be32(dump->requested_level);
+
+	return 0;
+}
+
+int nfp_net_dump_populate_buffer(struct nfp_pf *pf, struct nfp_dumpspec *spec,
+				 struct ethtool_dump *dump_param, void *dest)
+{
+	struct nfp_dump_state dump;
+	int err;
+
+	dump.requested_level = dump_param->flag;
+	dump.dumped_size = 0;
+	dump.p = dest;
+	dump.buf_size = dump_param->len;
+
+	err = nfp_dump_populate_prolog(&dump);
+	if (err)
+		return err;
+
+	err = nfp_traverse_tlvs(pf, spec->data, spec->size, &dump,
+				nfp_dump_specific_level);
+	if (err)
+		return err;
+
+	/* Set size of actual dump, to trigger warning if different from
+	 * calculated size.
+	 */
+	dump_param->len = dump.dumped_size;
+
+	return 0;
+}
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c b/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c
index 2801ecd09eab..2cde0eb00ee3 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c
@@ -51,14 +51,11 @@
 #include "nfpcore/nfp.h"
 #include "nfpcore/nfp_nsp.h"
 #include "nfp_app.h"
+#include "nfp_main.h"
 #include "nfp_net_ctrl.h"
 #include "nfp_net.h"
 #include "nfp_port.h"
 
-enum nfp_dump_diag {
-	NFP_DUMP_NSP_DIAG = 0,
-};
-
 struct nfp_et_stat {
 	char name[ETH_GSTRING_LEN];
 	int off;
@@ -1066,15 +1063,34 @@ exit_release:
 	return ret;
 }
 
+/* Set the dump flag/level. Calculate the dump length for flag > 0 only (new TLV
+ * based dumps), since flag 0 (default) calculates the length in
+ * nfp_app_get_dump_flag(), and we need to support triggering a level 0 dump
+ * without setting the flag first, for backward compatibility.
+ */
 static int nfp_app_set_dump(struct net_device *netdev, struct ethtool_dump *val)
 {
 	struct nfp_app *app = nfp_app_from_netdev(netdev);
+	s64 len;
 
 	if (!app)
 		return -EOPNOTSUPP;
 
-	if (val->flag != NFP_DUMP_NSP_DIAG)
-		return -EINVAL;
+	if (val->flag == NFP_DUMP_NSP_DIAG) {
+		app->pf->dump_flag = val->flag;
+		return 0;
+	}
+
+	if (!app->pf->dumpspec)
+		return -EOPNOTSUPP;
+
+	len = nfp_net_dump_calculate_size(app->pf, app->pf->dumpspec,
+					  val->flag);
+	if (len < 0)
+		return len;
+
+	app->pf->dump_flag = val->flag;
+	app->pf->dump_len = len;
 
 	return 0;
 }
@@ -1082,14 +1098,37 @@ static int nfp_app_set_dump(struct net_device *netdev, struct ethtool_dump *val)
 static int
 nfp_app_get_dump_flag(struct net_device *netdev, struct ethtool_dump *dump)
 {
-	return nfp_dump_nsp_diag(nfp_app_from_netdev(netdev), dump, NULL);
+	struct nfp_app *app = nfp_app_from_netdev(netdev);
+
+	if (!app)
+		return -EOPNOTSUPP;
+
+	if (app->pf->dump_flag == NFP_DUMP_NSP_DIAG)
+		return nfp_dump_nsp_diag(app, dump, NULL);
+
+	dump->flag = app->pf->dump_flag;
+	dump->len = app->pf->dump_len;
+
+	return 0;
 }
 
 static int
 nfp_app_get_dump_data(struct net_device *netdev, struct ethtool_dump *dump,
 		      void *buffer)
 {
-	return nfp_dump_nsp_diag(nfp_app_from_netdev(netdev), dump, buffer);
+	struct nfp_app *app = nfp_app_from_netdev(netdev);
+
+	if (!app)
+		return -EOPNOTSUPP;
+
+	if (app->pf->dump_flag == NFP_DUMP_NSP_DIAG)
+		return nfp_dump_nsp_diag(app, dump, buffer);
+
+	dump->flag = app->pf->dump_flag;
+	dump->len = app->pf->dump_len;
+
+	return nfp_net_dump_populate_buffer(app->pf, app->pf->dumpspec, dump,
+					    buffer);
 }
 
 static int nfp_net_set_coalesce(struct net_device *netdev,
diff --git a/drivers/net/ethernet/netronome/nfp/nfpcore/nfp.h b/drivers/net/ethernet/netronome/nfp/nfpcore/nfp.h
index 3ce51f03126f..ced62d112aa2 100644
--- a/drivers/net/ethernet/netronome/nfp/nfpcore/nfp.h
+++ b/drivers/net/ethernet/netronome/nfp/nfpcore/nfp.h
@@ -49,6 +49,8 @@
 struct nfp_hwinfo;
 struct nfp_hwinfo *nfp_hwinfo_read(struct nfp_cpp *cpp);
 const char *nfp_hwinfo_lookup(struct nfp_hwinfo *hwinfo, const char *lookup);
+char *nfp_hwinfo_get_packed_strings(struct nfp_hwinfo *hwinfo);
+u32 nfp_hwinfo_get_packed_str_size(struct nfp_hwinfo *hwinfo);
 
 /* Implemented in nfp_nsp.c, low level functions */
 
diff --git a/drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c b/drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c
index 04dd5758ecf5..3fcb522d2e85 100644
--- a/drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c
+++ b/drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c
@@ -372,8 +372,7 @@ nfp_cpp_area_alloc(struct nfp_cpp *cpp, u32 dest,
  * that it can be accessed directly.
  *
  * NOTE: @address and @size must be 32-bit aligned values.
- *
- * NOTE: The area must also be 'released' when the structure is freed.
+ * The area must also be 'released' when the structure is freed.
  *
  * Return: NFP CPP Area handle, or NULL
  */
@@ -536,8 +535,7 @@ void nfp_cpp_area_release_free(struct nfp_cpp_area *area)
  * Read data from indicated CPP region.
  *
  * NOTE: @offset and @length must be 32-bit aligned values.
- *
- * NOTE: Area must have been locked down with an 'acquire'.
+ * Area must have been locked down with an 'acquire'.
  *
  * Return: length of io, or -ERRNO
  */
@@ -558,8 +556,7 @@ int nfp_cpp_area_read(struct nfp_cpp_area *area,
  * Write data to indicated CPP region.
  *
  * NOTE: @offset and @length must be 32-bit aligned values.
- *
- * NOTE: Area must have been locked down with an 'acquire'.
+ * Area must have been locked down with an 'acquire'.
  *
  * Return: length of io, or -ERRNO
  */
diff --git a/drivers/net/ethernet/netronome/nfp/nfpcore/nfp_hwinfo.c b/drivers/net/ethernet/netronome/nfp/nfpcore/nfp_hwinfo.c
index 4f24aff1e772..063a9a6243d6 100644
--- a/drivers/net/ethernet/netronome/nfp/nfpcore/nfp_hwinfo.c
+++ b/drivers/net/ethernet/netronome/nfp/nfpcore/nfp_hwinfo.c
@@ -302,3 +302,13 @@ const char *nfp_hwinfo_lookup(struct nfp_hwinfo *hwinfo, const char *lookup)
 
 	return NULL;
 }
+
+char *nfp_hwinfo_get_packed_strings(struct nfp_hwinfo *hwinfo)
+{
+	return hwinfo->data;
+}
+
+u32 nfp_hwinfo_get_packed_str_size(struct nfp_hwinfo *hwinfo)
+{
+	return le32_to_cpu(hwinfo->size) - sizeof(u32);
+}
diff --git a/drivers/net/ethernet/nvidia/forcedeth.c b/drivers/net/ethernet/nvidia/forcedeth.c
index 481876b5424c..53614ed694fc 100644
--- a/drivers/net/ethernet/nvidia/forcedeth.c
+++ b/drivers/net/ethernet/nvidia/forcedeth.c
@@ -2563,7 +2563,7 @@ static int nv_tx_done(struct net_device *dev, int limit)
 
 		if (np->desc_ver == DESC_VER_1) {
 			if (flags & NV_TX_LASTPACKET) {
-				if (flags & NV_TX_ERROR) {
+				if (unlikely(flags & NV_TX_ERROR)) {
 					if ((flags & NV_TX_RETRYERROR)
 					    && !(flags & NV_TX_RETRYCOUNT_MASK))
 						nv_legacybackoff_reseed(dev);
@@ -2580,7 +2580,7 @@ static int nv_tx_done(struct net_device *dev, int limit)
 			}
 		} else {
 			if (flags & NV_TX2_LASTPACKET) {
-				if (flags & NV_TX2_ERROR) {
+				if (unlikely(flags & NV_TX2_ERROR)) {
 					if ((flags & NV_TX2_RETRYERROR)
 					    && !(flags & NV_TX2_RETRYCOUNT_MASK))
 						nv_legacybackoff_reseed(dev);
@@ -2626,7 +2626,7 @@ static int nv_tx_done_optimized(struct net_device *dev, int limit)
 		nv_unmap_txskb(np, np->get_tx_ctx);
 
 		if (flags & NV_TX2_LASTPACKET) {
-			if (flags & NV_TX2_ERROR) {
+			if (unlikely(flags & NV_TX2_ERROR)) {
 				if ((flags & NV_TX2_RETRYERROR)
 				    && !(flags & NV_TX2_RETRYCOUNT_MASK)) {
 					if (np->driver_data & DEV_HAS_GEAR_MODE)
diff --git a/drivers/net/ethernet/qlogic/netxen/netxen_nic_hw.c b/drivers/net/ethernet/qlogic/netxen/netxen_nic_hw.c
index 0a66389c06c2..1cd39c9a0345 100644
--- a/drivers/net/ethernet/qlogic/netxen/netxen_nic_hw.c
+++ b/drivers/net/ethernet/qlogic/netxen/netxen_nic_hw.c
@@ -2502,12 +2502,10 @@ netxen_collect_minidump(struct netxen_adapter *adapter)
 {
 	int ret = 0;
 	struct netxen_minidump_template_hdr *hdr;
-	struct timespec val;
 	hdr = (struct netxen_minidump_template_hdr *)
 				adapter->mdump.md_template;
 	hdr->driver_capture_mask = adapter->mdump.md_capture_mask;
-	jiffies_to_timespec(jiffies, &val);
-	hdr->driver_timestamp = (u32) val.tv_sec;
+	hdr->driver_timestamp = ktime_get_seconds();
 	hdr->driver_info_word2 = adapter->fw_version;
 	hdr->driver_info_word3 = NXRD32(adapter, CRB_DRIVER_VERSION);
 	ret = netxen_parse_md_template(adapter);
diff --git a/drivers/net/ethernet/qlogic/qede/qede_main.c b/drivers/net/ethernet/qlogic/qede/qede_main.c
index 8f9b3eb82137..57332b3e5e64 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_main.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_main.c
@@ -1068,10 +1068,6 @@ static void __qede_remove(struct pci_dev *pdev, enum qede_remove_mode mode)
 
 	pci_set_drvdata(pdev, NULL);
 
-	/* Release edev's reference to XDP's bpf if such exist */
-	if (edev->xdp_prog)
-		bpf_prog_put(edev->xdp_prog);
-
 	/* Use global ops since we've freed edev */
 	qed_ops->common->slowpath_stop(cdev);
 	if (system_state == SYSTEM_POWER_OFF)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index d7250539d0bd..c52a9963c19d 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -1997,22 +1997,60 @@ static void stmmac_set_dma_operation_mode(struct stmmac_priv *priv, u32 txmode,
 static void stmmac_dma_interrupt(struct stmmac_priv *priv)
 {
 	u32 tx_channel_count = priv->plat->tx_queues_to_use;
-	int status;
+	u32 rx_channel_count = priv->plat->rx_queues_to_use;
+	u32 channels_to_check = tx_channel_count > rx_channel_count ?
+				tx_channel_count : rx_channel_count;
 	u32 chan;
+	bool poll_scheduled = false;
+	int status[channels_to_check];
+
+	/* Each DMA channel can be used for rx and tx simultaneously, yet
+	 * napi_struct is embedded in struct stmmac_rx_queue rather than in a
+	 * stmmac_channel struct.
+	 * Because of this, stmmac_poll currently checks (and possibly wakes)
+	 * all tx queues rather than just a single tx queue.
+	 */
+	for (chan = 0; chan < channels_to_check; chan++)
+		status[chan] = priv->hw->dma->dma_interrupt(priv->ioaddr,
+							    &priv->xstats,
+							    chan);
 
-	for (chan = 0; chan < tx_channel_count; chan++) {
-		struct stmmac_rx_queue *rx_q = &priv->rx_queue[chan];
+	for (chan = 0; chan < rx_channel_count; chan++) {
+		if (likely(status[chan] & handle_rx)) {
+			struct stmmac_rx_queue *rx_q = &priv->rx_queue[chan];
 
-		status = priv->hw->dma->dma_interrupt(priv->ioaddr,
-						      &priv->xstats, chan);
-		if (likely((status & handle_rx)) || (status & handle_tx)) {
 			if (likely(napi_schedule_prep(&rx_q->napi))) {
 				stmmac_disable_dma_irq(priv, chan);
 				__napi_schedule(&rx_q->napi);
+				poll_scheduled = true;
+			}
+		}
+	}
+
+	/* If we scheduled poll, we already know that tx queues will be checked.
+	 * If we didn't schedule poll, see if any DMA channel (used by tx) has a
+	 * completed transmission, if so, call stmmac_poll (once).
+	 */
+	if (!poll_scheduled) {
+		for (chan = 0; chan < tx_channel_count; chan++) {
+			if (status[chan] & handle_tx) {
+				/* It doesn't matter what rx queue we choose
+				 * here. We use 0 since it always exists.
+				 */
+				struct stmmac_rx_queue *rx_q =
+					&priv->rx_queue[0];
+
+				if (likely(napi_schedule_prep(&rx_q->napi))) {
+					stmmac_disable_dma_irq(priv, chan);
+					__napi_schedule(&rx_q->napi);
+				}
+				break;
 			}
 		}
+	}
 
-		if (unlikely(status & tx_hard_error_bump_tc)) {
+	for (chan = 0; chan < tx_channel_count; chan++) {
+		if (unlikely(status[chan] & tx_hard_error_bump_tc)) {
 			/* Try to bump up the dma threshold on this failure */
 			if (unlikely(priv->xstats.threshold != SF_DMA_MODE) &&
 			    (tc <= 256)) {
@@ -2029,7 +2067,7 @@ static void stmmac_dma_interrupt(struct stmmac_priv *priv)
 								    chan);
 				priv->xstats.threshold = tc;
 			}
-		} else if (unlikely(status == tx_hard_error)) {
+		} else if (unlikely(status[chan] == tx_hard_error)) {
 			stmmac_tx_err(priv, chan);
 		}
 	}
diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index a73600dceb8b..a60a378b8b29 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -88,6 +88,7 @@ do {								\
 #define CPSW_VERSION_4		0x190112
 
 #define HOST_PORT_NUM		0
+#define CPSW_ALE_PORTS_NUM	3
 #define SLIVER_SIZE		0x40
 
 #define CPSW1_HOST_PORT_OFFSET	0x028
@@ -352,6 +353,27 @@ struct cpsw_hw_stats {
 	u32	rxdmaoverruns;
 };
 
+struct cpsw_slave_data {
+	struct device_node *phy_node;
+	char		phy_id[MII_BUS_ID_SIZE];
+	int		phy_if;
+	u8		mac_addr[ETH_ALEN];
+	u16		dual_emac_res_vlan;	/* Reserved VLAN for DualEMAC */
+};
+
+struct cpsw_platform_data {
+	struct cpsw_slave_data	*slave_data;
+	u32	ss_reg_ofs;	/* Subsystem control register offset */
+	u32	channels;	/* number of cpdma channels (symmetric) */
+	u32	slaves;		/* number of slave cpgmac ports */
+	u32	active_slave; /* time stamping, ethtool and SIOCGMIIPHY slave */
+	u32	ale_entries;	/* ale table size */
+	u32	bd_ram_size;  /*buffer descriptor ram size */
+	u32	mac_control;	/* Mac control register */
+	u16	default_vlan;	/* Def VLAN for ALE lookup in VLAN aware mode*/
+	bool	dual_emac;	/* Enable Dual EMAC mode */
+};
+
 struct cpsw_slave {
 	void __iomem			*regs;
 	struct cpsw_sliver_regs __iomem	*sliver;
@@ -365,12 +387,12 @@ struct cpsw_slave {
 
 static inline u32 slave_read(struct cpsw_slave *slave, u32 offset)
 {
-	return __raw_readl(slave->regs + offset);
+	return readl_relaxed(slave->regs + offset);
 }
 
 static inline void slave_write(struct cpsw_slave *slave, u32 val, u32 offset)
 {
-	__raw_writel(val, slave->regs + offset);
+	writel_relaxed(val, slave->regs + offset);
 }
 
 struct cpsw_vector {
@@ -660,8 +682,8 @@ static void cpsw_ndo_set_rx_mode(struct net_device *ndev)
 
 static void cpsw_intr_enable(struct cpsw_common *cpsw)
 {
-	__raw_writel(0xFF, &cpsw->wr_regs->tx_en);
-	__raw_writel(0xFF, &cpsw->wr_regs->rx_en);
+	writel_relaxed(0xFF, &cpsw->wr_regs->tx_en);
+	writel_relaxed(0xFF, &cpsw->wr_regs->rx_en);
 
 	cpdma_ctlr_int_ctrl(cpsw->dma, true);
 	return;
@@ -669,8 +691,8 @@ static void cpsw_intr_enable(struct cpsw_common *cpsw)
 
 static void cpsw_intr_disable(struct cpsw_common *cpsw)
 {
-	__raw_writel(0, &cpsw->wr_regs->tx_en);
-	__raw_writel(0, &cpsw->wr_regs->rx_en);
+	writel_relaxed(0, &cpsw->wr_regs->tx_en);
+	writel_relaxed(0, &cpsw->wr_regs->rx_en);
 
 	cpdma_ctlr_int_ctrl(cpsw->dma, false);
 	return;
@@ -949,18 +971,14 @@ static inline void soft_reset(const char *module, void __iomem *reg)
 {
 	unsigned long timeout = jiffies + HZ;
 
-	__raw_writel(1, reg);
+	writel_relaxed(1, reg);
 	do {
 		cpu_relax();
-	} while ((__raw_readl(reg) & 1) && time_after(timeout, jiffies));
+	} while ((readl_relaxed(reg) & 1) && time_after(timeout, jiffies));
 
-	WARN(__raw_readl(reg) & 1, "failed to soft-reset %s\n", module);
+	WARN(readl_relaxed(reg) & 1, "failed to soft-reset %s\n", module);
 }
 
-#define mac_hi(mac)	(((mac)[0] << 0) | ((mac)[1] << 8) |	\
-			 ((mac)[2] << 16) | ((mac)[3] << 24))
-#define mac_lo(mac)	(((mac)[4] << 0) | ((mac)[5] << 8))
-
 static void cpsw_set_slave_mac(struct cpsw_slave *slave,
 			       struct cpsw_priv *priv)
 {
@@ -1015,7 +1033,7 @@ static void _cpsw_adjust_link(struct cpsw_slave *slave,
 
 	if (mac_control != slave->mac_control) {
 		phy_print_status(phy);
-		__raw_writel(mac_control, &slave->sliver->mac_control);
+		writel_relaxed(mac_control, &slave->sliver->mac_control);
 	}
 
 	slave->mac_control = mac_control;
@@ -1278,7 +1296,7 @@ static void cpsw_slave_open(struct cpsw_slave *slave, struct cpsw_priv *priv)
 	soft_reset_slave(slave);
 
 	/* setup priority mapping */
-	__raw_writel(RX_PRIORITY_MAPPING, &slave->sliver->rx_pri_map);
+	writel_relaxed(RX_PRIORITY_MAPPING, &slave->sliver->rx_pri_map);
 
 	switch (cpsw->version) {
 	case CPSW_VERSION_1:
@@ -1304,7 +1322,7 @@ static void cpsw_slave_open(struct cpsw_slave *slave, struct cpsw_priv *priv)
 	}
 
 	/* setup max packet size, and mac address */
-	__raw_writel(cpsw->rx_packet_max, &slave->sliver->rx_maxlen);
+	writel_relaxed(cpsw->rx_packet_max, &slave->sliver->rx_maxlen);
 	cpsw_set_slave_mac(slave, priv);
 
 	slave->mac_control = 0;	/* no link yet */
@@ -1395,9 +1413,9 @@ static void cpsw_init_host_port(struct cpsw_priv *priv)
 	writel(fifo_mode, &cpsw->host_port_regs->tx_in_ctl);
 
 	/* setup host port priority mapping */
-	__raw_writel(CPDMA_TX_PRIORITY_MAP,
-		     &cpsw->host_port_regs->cpdma_tx_pri_map);
-	__raw_writel(0, &cpsw->host_port_regs->cpdma_rx_chan_map);
+	writel_relaxed(CPDMA_TX_PRIORITY_MAP,
+		       &cpsw->host_port_regs->cpdma_tx_pri_map);
+	writel_relaxed(0, &cpsw->host_port_regs->cpdma_rx_chan_map);
 
 	cpsw_ale_control_set(cpsw->ale, HOST_PORT_NUM,
 			     ALE_PORT_STATE, ALE_PORT_STATE_FORWARD);
@@ -1514,10 +1532,10 @@ static int cpsw_ndo_open(struct net_device *ndev)
 	/* initialize shared resources for every ndev */
 	if (!cpsw->usage_count) {
 		/* disable priority elevation */
-		__raw_writel(0, &cpsw->regs->ptype);
+		writel_relaxed(0, &cpsw->regs->ptype);
 
 		/* enable statistics collection only on all ports */
-		__raw_writel(0x7, &cpsw->regs->stat_port_en);
+		writel_relaxed(0x7, &cpsw->regs->stat_port_en);
 
 		/* Enable internal fifo flow control */
 		writel(0x7, &cpsw->regs->flow_control);
@@ -1701,7 +1719,7 @@ static void cpsw_hwtstamp_v2(struct cpsw_priv *priv)
 
 	slave_write(slave, mtype, CPSW2_TS_SEQ_MTYPE);
 	slave_write(slave, ctrl, CPSW2_CONTROL);
-	__raw_writel(ETH_P_1588, &cpsw->regs->ts_ltype);
+	writel_relaxed(ETH_P_1588, &cpsw->regs->ts_ltype);
 }
 
 static int cpsw_hwtstamp_set(struct net_device *dev, struct ifreq *ifr)
@@ -2298,7 +2316,6 @@ static int cpsw_check_ch_settings(struct cpsw_common *cpsw,
 
 static int cpsw_update_channels_res(struct cpsw_priv *priv, int ch_num, int rx)
 {
-	int (*poll)(struct napi_struct *, int);
 	struct cpsw_common *cpsw = priv->cpsw;
 	void (*handler)(void *, int, int);
 	struct netdev_queue *queue;
@@ -2309,12 +2326,10 @@ static int cpsw_update_channels_res(struct cpsw_priv *priv, int ch_num, int rx)
 		ch = &cpsw->rx_ch_num;
 		vec = cpsw->rxv;
 		handler = cpsw_rx_handler;
-		poll = cpsw_rx_poll;
 	} else {
 		ch = &cpsw->tx_ch_num;
 		vec = cpsw->txv;
 		handler = cpsw_tx_handler;
-		poll = cpsw_tx_poll;
 	}
 
 	while (*ch < ch_num) {
@@ -3060,7 +3075,7 @@ static int cpsw_probe(struct platform_device *pdev)
 	ale_params.dev			= &pdev->dev;
 	ale_params.ale_ageout		= ale_ageout;
 	ale_params.ale_entries		= data->ale_entries;
-	ale_params.ale_ports		= data->slaves;
+	ale_params.ale_ports		= CPSW_ALE_PORTS_NUM;
 
 	cpsw->ale = cpsw_ale_create(&ale_params);
 	if (!cpsw->ale) {
@@ -3072,14 +3087,14 @@ static int cpsw_probe(struct platform_device *pdev)
 	cpsw->cpts = cpts_create(cpsw->dev, cpts_regs, cpsw->dev->of_node);
 	if (IS_ERR(cpsw->cpts)) {
 		ret = PTR_ERR(cpsw->cpts);
-		goto clean_ale_ret;
+		goto clean_dma_ret;
 	}
 
 	ndev->irq = platform_get_irq(pdev, 1);
 	if (ndev->irq < 0) {
 		dev_err(priv->dev, "error getting irq resource\n");
 		ret = ndev->irq;
-		goto clean_ale_ret;
+		goto clean_dma_ret;
 	}
 
 	of_id = of_match_device(cpsw_of_mtable, &pdev->dev);
@@ -3103,7 +3118,7 @@ static int cpsw_probe(struct platform_device *pdev)
 	if (ret) {
 		dev_err(priv->dev, "error registering net device\n");
 		ret = -ENODEV;
-		goto clean_ale_ret;
+		goto clean_dma_ret;
 	}
 
 	if (cpsw->data.dual_emac) {
@@ -3126,7 +3141,7 @@ static int cpsw_probe(struct platform_device *pdev)
 	irq = platform_get_irq(pdev, 1);
 	if (irq < 0) {
 		ret = irq;
-		goto clean_ale_ret;
+		goto clean_dma_ret;
 	}
 
 	cpsw->irqs_table[0] = irq;
@@ -3134,14 +3149,14 @@ static int cpsw_probe(struct platform_device *pdev)
 			       0, dev_name(&pdev->dev), cpsw);
 	if (ret < 0) {
 		dev_err(priv->dev, "error attaching irq (%d)\n", ret);
-		goto clean_ale_ret;
+		goto clean_dma_ret;
 	}
 
 	/* TX IRQ */
 	irq = platform_get_irq(pdev, 2);
 	if (irq < 0) {
 		ret = irq;
-		goto clean_ale_ret;
+		goto clean_dma_ret;
 	}
 
 	cpsw->irqs_table[1] = irq;
@@ -3149,7 +3164,7 @@ static int cpsw_probe(struct platform_device *pdev)
 			       0, dev_name(&pdev->dev), cpsw);
 	if (ret < 0) {
 		dev_err(priv->dev, "error attaching irq (%d)\n", ret);
-		goto clean_ale_ret;
+		goto clean_dma_ret;
 	}
 
 	cpsw_notice(priv, probe,
@@ -3162,8 +3177,6 @@ static int cpsw_probe(struct platform_device *pdev)
 
 clean_unregister_netdev_ret:
 	unregister_netdev(ndev);
-clean_ale_ret:
-	cpsw_ale_destroy(cpsw->ale);
 clean_dma_ret:
 	cpdma_ctlr_destroy(cpsw->dma);
 clean_dt_ret:
@@ -3193,7 +3206,6 @@ static int cpsw_remove(struct platform_device *pdev)
 	unregister_netdev(ndev);
 
 	cpts_release(cpsw->cpts);
-	cpsw_ale_destroy(cpsw->ale);
 	cpdma_ctlr_destroy(cpsw->dma);
 	cpsw_remove_dt(pdev);
 	pm_runtime_put_sync(&pdev->dev);
diff --git a/drivers/net/ethernet/ti/cpsw.h b/drivers/net/ethernet/ti/cpsw.h
index 6c3037aa2cd3..cf111db3dc27 100644
--- a/drivers/net/ethernet/ti/cpsw.h
+++ b/drivers/net/ethernet/ti/cpsw.h
@@ -17,26 +17,9 @@
 #include <linux/if_ether.h>
 #include <linux/phy.h>
 
-struct cpsw_slave_data {
-	struct device_node *phy_node;
-	char		phy_id[MII_BUS_ID_SIZE];
-	int		phy_if;
-	u8		mac_addr[ETH_ALEN];
-	u16		dual_emac_res_vlan;	/* Reserved VLAN for DualEMAC */
-};
-
-struct cpsw_platform_data {
-	struct cpsw_slave_data	*slave_data;
-	u32	ss_reg_ofs;	/* Subsystem control register offset */
-	u32	channels;	/* number of cpdma channels (symmetric) */
-	u32	slaves;		/* number of slave cpgmac ports */
-	u32	active_slave; /* time stamping, ethtool and SIOCGMIIPHY slave */
-	u32	ale_entries;	/* ale table size */
-	u32	bd_ram_size;  /*buffer descriptor ram size */
-	u32	mac_control;	/* Mac control register */
-	u16	default_vlan;	/* Def VLAN for ALE lookup in VLAN aware mode*/
-	bool	dual_emac;	/* Enable Dual EMAC mode */
-};
+#define mac_hi(mac)	(((mac)[0] << 0) | ((mac)[1] << 8) |	\
+			 ((mac)[2] << 16) | ((mac)[3] << 24))
+#define mac_lo(mac)	(((mac)[4] << 0) | ((mac)[5] << 8))
 
 void cpsw_phy_sel(struct device *dev, phy_interface_t phy_mode, int slave);
 int ti_cm_get_macid(struct device *dev, int slave, u8 *mac_addr);
diff --git a/drivers/net/ethernet/ti/cpsw_ale.c b/drivers/net/ethernet/ti/cpsw_ale.c
index b432a75fb874..93dc05c194d3 100644
--- a/drivers/net/ethernet/ti/cpsw_ale.c
+++ b/drivers/net/ethernet/ti/cpsw_ale.c
@@ -150,11 +150,11 @@ static int cpsw_ale_read(struct cpsw_ale *ale, int idx, u32 *ale_entry)
 
 	WARN_ON(idx > ale->params.ale_entries);
 
-	__raw_writel(idx, ale->params.ale_regs + ALE_TABLE_CONTROL);
+	writel_relaxed(idx, ale->params.ale_regs + ALE_TABLE_CONTROL);
 
 	for (i = 0; i < ALE_ENTRY_WORDS; i++)
-		ale_entry[i] = __raw_readl(ale->params.ale_regs +
-					   ALE_TABLE + 4 * i);
+		ale_entry[i] = readl_relaxed(ale->params.ale_regs +
+					     ALE_TABLE + 4 * i);
 
 	return idx;
 }
@@ -166,11 +166,11 @@ static int cpsw_ale_write(struct cpsw_ale *ale, int idx, u32 *ale_entry)
 	WARN_ON(idx > ale->params.ale_entries);
 
 	for (i = 0; i < ALE_ENTRY_WORDS; i++)
-		__raw_writel(ale_entry[i], ale->params.ale_regs +
-			     ALE_TABLE + 4 * i);
+		writel_relaxed(ale_entry[i], ale->params.ale_regs +
+			       ALE_TABLE + 4 * i);
 
-	__raw_writel(idx | ALE_TABLE_WRITE, ale->params.ale_regs +
-		     ALE_TABLE_CONTROL);
+	writel_relaxed(idx | ALE_TABLE_WRITE, ale->params.ale_regs +
+		       ALE_TABLE_CONTROL);
 
 	return idx;
 }
@@ -723,7 +723,7 @@ int cpsw_ale_control_set(struct cpsw_ale *ale, int port, int control,
 	if (info->port_offset == 0 && info->port_shift == 0)
 		port = 0; /* global, port is a dont care */
 
-	if (port < 0 || port > ale->params.ale_ports)
+	if (port < 0 || port >= ale->params.ale_ports)
 		return -EINVAL;
 
 	mask = BITMASK(info->bits);
@@ -733,9 +733,9 @@ int cpsw_ale_control_set(struct cpsw_ale *ale, int port, int control,
 	offset = info->offset + (port * info->port_offset);
 	shift  = info->shift  + (port * info->port_shift);
 
-	tmp = __raw_readl(ale->params.ale_regs + offset);
+	tmp = readl_relaxed(ale->params.ale_regs + offset);
 	tmp = (tmp & ~(mask << shift)) | (value << shift);
-	__raw_writel(tmp, ale->params.ale_regs + offset);
+	writel_relaxed(tmp, ale->params.ale_regs + offset);
 
 	return 0;
 }
@@ -754,13 +754,13 @@ int cpsw_ale_control_get(struct cpsw_ale *ale, int port, int control)
 	if (info->port_offset == 0 && info->port_shift == 0)
 		port = 0; /* global, port is a dont care */
 
-	if (port < 0 || port > ale->params.ale_ports)
+	if (port < 0 || port >= ale->params.ale_ports)
 		return -EINVAL;
 
 	offset = info->offset + (port * info->port_offset);
 	shift  = info->shift  + (port * info->port_shift);
 
-	tmp = __raw_readl(ale->params.ale_regs + offset) >> shift;
+	tmp = readl_relaxed(ale->params.ale_regs + offset) >> shift;
 	return tmp & BITMASK(info->bits);
 }
 EXPORT_SYMBOL_GPL(cpsw_ale_control_get);
@@ -779,9 +779,37 @@ static void cpsw_ale_timer(struct timer_list *t)
 
 void cpsw_ale_start(struct cpsw_ale *ale)
 {
+	cpsw_ale_control_set(ale, 0, ALE_ENABLE, 1);
+	cpsw_ale_control_set(ale, 0, ALE_CLEAR, 1);
+
+	timer_setup(&ale->timer, cpsw_ale_timer, 0);
+	if (ale->ageout) {
+		ale->timer.expires = jiffies + ale->ageout;
+		add_timer(&ale->timer);
+	}
+}
+EXPORT_SYMBOL_GPL(cpsw_ale_start);
+
+void cpsw_ale_stop(struct cpsw_ale *ale)
+{
+	del_timer_sync(&ale->timer);
+	cpsw_ale_control_set(ale, 0, ALE_ENABLE, 0);
+}
+EXPORT_SYMBOL_GPL(cpsw_ale_stop);
+
+struct cpsw_ale *cpsw_ale_create(struct cpsw_ale_params *params)
+{
+	struct cpsw_ale *ale;
 	u32 rev, ale_entries;
 
-	rev = __raw_readl(ale->params.ale_regs + ALE_IDVER);
+	ale = devm_kzalloc(params->dev, sizeof(*ale), GFP_KERNEL);
+	if (!ale)
+		return NULL;
+
+	ale->params = *params;
+	ale->ageout = ale->params.ale_ageout * HZ;
+
+	rev = readl_relaxed(ale->params.ale_regs + ALE_IDVER);
 	if (!ale->params.major_ver_mask)
 		ale->params.major_ver_mask = 0xff;
 	ale->version =
@@ -793,8 +821,8 @@ void cpsw_ale_start(struct cpsw_ale *ale)
 
 	if (!ale->params.ale_entries) {
 		ale_entries =
-			__raw_readl(ale->params.ale_regs + ALE_STATUS) &
-				    ALE_STATUS_SIZE_MASK;
+			readl_relaxed(ale->params.ale_regs + ALE_STATUS) &
+			ALE_STATUS_SIZE_MASK;
 		/* ALE available on newer NetCP switches has introduced
 		 * a register, ALE_STATUS, to indicate the size of ALE
 		 * table which shows the size as a multiple of 1024 entries.
@@ -816,9 +844,9 @@ void cpsw_ale_start(struct cpsw_ale *ale)
 		 "ALE Table size %ld\n", ale->params.ale_entries);
 
 	/* set default bits for existing h/w */
-	ale->port_mask_bits = 3;
-	ale->port_num_bits = 2;
-	ale->vlan_field_bits = 3;
+	ale->port_mask_bits = ale->params.ale_ports;
+	ale->port_num_bits = order_base_2(ale->params.ale_ports);
+	ale->vlan_field_bits = ale->params.ale_ports;
 
 	/* Set defaults override for ALE on NetCP NU switch and for version
 	 * 1R3
@@ -847,57 +875,12 @@ void cpsw_ale_start(struct cpsw_ale *ale)
 		ale_controls[ALE_PORT_UNTAGGED_EGRESS].shift = 0;
 		ale_controls[ALE_PORT_UNTAGGED_EGRESS].offset =
 					ALE_UNKNOWNVLAN_FORCE_UNTAG_EGRESS;
-		ale->port_mask_bits = ale->params.ale_ports;
-		ale->port_num_bits = ale->params.ale_ports - 1;
-		ale->vlan_field_bits = ale->params.ale_ports;
-	} else if (ale->version == ALE_VERSION_1R3) {
-		ale->port_mask_bits = ale->params.ale_ports;
-		ale->port_num_bits = 3;
-		ale->vlan_field_bits = ale->params.ale_ports;
 	}
 
-	cpsw_ale_control_set(ale, 0, ALE_ENABLE, 1);
-	cpsw_ale_control_set(ale, 0, ALE_CLEAR, 1);
-
-	timer_setup(&ale->timer, cpsw_ale_timer, 0);
-	if (ale->ageout) {
-		ale->timer.expires = jiffies + ale->ageout;
-		add_timer(&ale->timer);
-	}
-}
-EXPORT_SYMBOL_GPL(cpsw_ale_start);
-
-void cpsw_ale_stop(struct cpsw_ale *ale)
-{
-	del_timer_sync(&ale->timer);
-}
-EXPORT_SYMBOL_GPL(cpsw_ale_stop);
-
-struct cpsw_ale *cpsw_ale_create(struct cpsw_ale_params *params)
-{
-	struct cpsw_ale *ale;
-
-	ale = kzalloc(sizeof(*ale), GFP_KERNEL);
-	if (!ale)
-		return NULL;
-
-	ale->params = *params;
-	ale->ageout = ale->params.ale_ageout * HZ;
-
 	return ale;
 }
 EXPORT_SYMBOL_GPL(cpsw_ale_create);
 
-int cpsw_ale_destroy(struct cpsw_ale *ale)
-{
-	if (!ale)
-		return -EINVAL;
-	cpsw_ale_control_set(ale, 0, ALE_ENABLE, 0);
-	kfree(ale);
-	return 0;
-}
-EXPORT_SYMBOL_GPL(cpsw_ale_destroy);
-
 void cpsw_ale_dump(struct cpsw_ale *ale, u32 *data)
 {
 	int i;
diff --git a/drivers/net/ethernet/ti/cpsw_ale.h b/drivers/net/ethernet/ti/cpsw_ale.h
index 25d24e8d0904..d4fe9016429b 100644
--- a/drivers/net/ethernet/ti/cpsw_ale.h
+++ b/drivers/net/ethernet/ti/cpsw_ale.h
@@ -100,7 +100,6 @@ enum cpsw_ale_port_state {
 #define ALE_ENTRY_WORDS	DIV_ROUND_UP(ALE_ENTRY_BITS, 32)
 
 struct cpsw_ale *cpsw_ale_create(struct cpsw_ale_params *params);
-int cpsw_ale_destroy(struct cpsw_ale *ale);
 
 void cpsw_ale_start(struct cpsw_ale *ale);
 void cpsw_ale_stop(struct cpsw_ale *ale);
diff --git a/drivers/net/ethernet/ti/davinci_emac.c b/drivers/net/ethernet/ti/davinci_emac.c
index 4bb561856af5..f58c0c620356 100644
--- a/drivers/net/ethernet/ti/davinci_emac.c
+++ b/drivers/net/ethernet/ti/davinci_emac.c
@@ -1385,11 +1385,6 @@ static int emac_devioctl(struct net_device *ndev, struct ifreq *ifrq, int cmd)
 		return -EOPNOTSUPP;
 }
 
-static int match_first_device(struct device *dev, void *data)
-{
-	return !strncmp(dev_name(dev), "davinci_mdio", 12);
-}
-
 /**
  * emac_dev_open - EMAC device open
  * @ndev: The DaVinci EMAC network adapter
@@ -1489,8 +1484,8 @@ static int emac_dev_open(struct net_device *ndev)
 
 	/* use the first phy on the bus if pdata did not give us a phy id */
 	if (!phydev && !priv->phy_id) {
-		phy = bus_find_device(&mdio_bus_type, NULL, NULL,
-				      match_first_device);
+		phy = bus_find_device_by_name(&mdio_bus_type, NULL,
+					      "davinci_mdio");
 		if (phy) {
 			priv->phy_id = dev_name(phy);
 			if (!priv->phy_id || !*priv->phy_id)
diff --git a/drivers/net/ethernet/ti/netcp_ethss.c b/drivers/net/ethernet/ti/netcp_ethss.c
index e831c49713ee..56dbc0b9fedc 100644
--- a/drivers/net/ethernet/ti/netcp_ethss.c
+++ b/drivers/net/ethernet/ti/netcp_ethss.c
@@ -27,6 +27,7 @@
 #include <linux/net_tstamp.h>
 #include <linux/ethtool.h>
 
+#include "cpsw.h"
 #include "cpsw_ale.h"
 #include "netcp.h"
 #include "cpts.h"
@@ -2047,10 +2048,6 @@ static const struct ethtool_ops keystone_ethtool_ops = {
 	.get_ts_info		= keystone_get_ts_info,
 };
 
-#define mac_hi(mac)	(((mac)[0] << 0) | ((mac)[1] << 8) |	\
-			 ((mac)[2] << 16) | ((mac)[3] << 24))
-#define mac_lo(mac)	(((mac)[4] << 0) | ((mac)[5] << 8))
-
 static void gbe_set_slave_mac(struct gbe_slave *slave,
 			      struct gbe_intf *gbe_intf)
 {
@@ -3692,7 +3689,6 @@ static int gbe_remove(struct netcp_device *netcp_device, void *inst_priv)
 	del_timer_sync(&gbe_dev->timer);
 	cpts_release(gbe_dev->cpts);
 	cpsw_ale_stop(gbe_dev->ale);
-	cpsw_ale_destroy(gbe_dev->ale);
 	netcp_txpipe_close(&gbe_dev->tx_pipe);
 	free_secondary_ports(gbe_dev);
 
diff --git a/drivers/net/hyperv/hyperv_net.h b/drivers/net/hyperv/hyperv_net.h
index 88ddfb92122b..3d940c67ea94 100644
--- a/drivers/net/hyperv/hyperv_net.h
+++ b/drivers/net/hyperv/hyperv_net.h
@@ -146,7 +146,6 @@ struct hv_netvsc_packet {
 
 struct netvsc_device_info {
 	unsigned char mac_adr[ETH_ALEN];
-	int  ring_size;
 	u32  num_chn;
 	u32  send_sections;
 	u32  recv_sections;
@@ -188,6 +187,9 @@ struct rndis_message;
 struct netvsc_device;
 struct net_device_context;
 
+extern u32 netvsc_ring_bytes;
+extern struct reciprocal_value netvsc_ring_reciprocal;
+
 struct netvsc_device *netvsc_device_add(struct hv_device *device,
 					const struct netvsc_device_info *info);
 int netvsc_alloc_recv_comp_ring(struct netvsc_device *net_device, u32 q_idx);
@@ -804,8 +806,6 @@ struct netvsc_device {
 
 	struct rndis_device *extension;
 
-	int ring_size;
-
 	u32 max_pkt; /* max number of pkt in one send, e.g. 8 */
 	u32 pkt_align; /* alignment bytes, e.g. 8 */
 
@@ -1425,32 +1425,6 @@ struct rndis_message {
 	(sizeof(msg) + (sizeof(struct rndis_message) -	\
 	 sizeof(union rndis_message_container)))
 
-/* get pointer to info buffer with message pointer */
-#define MESSAGE_TO_INFO_BUFFER(msg)				\
-	(((unsigned char *)(msg)) + msg->info_buf_offset)
-
-/* get pointer to status buffer with message pointer */
-#define MESSAGE_TO_STATUS_BUFFER(msg)			\
-	(((unsigned char *)(msg)) + msg->status_buf_offset)
-
-/* get pointer to OOBD buffer with message pointer */
-#define MESSAGE_TO_OOBD_BUFFER(msg)				\
-	(((unsigned char *)(msg)) + msg->oob_data_offset)
-
-/* get pointer to data buffer with message pointer */
-#define MESSAGE_TO_DATA_BUFFER(msg)				\
-	(((unsigned char *)(msg)) + msg->per_pkt_info_offset)
-
-/* get pointer to contained message from NDIS_MESSAGE pointer */
-#define RNDIS_MESSAGE_PTR_TO_MESSAGE_PTR(rndis_msg)		\
-	((void *) &rndis_msg->msg)
-
-/* get pointer to contained message from NDIS_MESSAGE pointer */
-#define RNDIS_MESSAGE_RAW_PTR_TO_MESSAGE_PTR(rndis_msg)	\
-	((void *) rndis_msg)
-
-
-
 #define RNDIS_HEADER_SIZE	(sizeof(struct rndis_message) - \
 				 sizeof(union rndis_message_container))
 
diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index bfc79698b8f4..e4bcd202a56a 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -31,6 +31,7 @@
 #include <linux/vmalloc.h>
 #include <linux/rtnetlink.h>
 #include <linux/prefetch.h>
+#include <linux/reciprocal_div.h>
 
 #include <asm/sync_bitops.h>
 
@@ -588,14 +589,11 @@ void netvsc_device_remove(struct hv_device *device)
  * Get the percentage of available bytes to write in the ring.
  * The return value is in range from 0 to 100.
  */
-static inline u32 hv_ringbuf_avail_percent(
-		struct hv_ring_buffer_info *ring_info)
+static u32 hv_ringbuf_avail_percent(const struct hv_ring_buffer_info *ring_info)
 {
-	u32 avail_read, avail_write;
+	u32 avail_write = hv_get_bytes_to_write(ring_info);
 
-	hv_get_ringbuffer_availbytes(ring_info, &avail_read, &avail_write);
-
-	return avail_write * 100 / ring_info->ring_datasize;
+	return reciprocal_divide(avail_write  * 100, netvsc_ring_reciprocal);
 }
 
 static inline void netvsc_free_send_slot(struct netvsc_device *net_device,
@@ -712,11 +710,12 @@ static u32 netvsc_copy_to_send_buf(struct netvsc_device *net_device,
 	int i;
 	u32 msg_size = 0;
 	u32 padding = 0;
-	u32 remain = packet->total_data_buflen % net_device->pkt_align;
 	u32 page_count = packet->cp_partial ? packet->rmsg_pgcnt :
 		packet->page_buf_cnt;
+	u32 remain;
 
 	/* Add padding */
+	remain = packet->total_data_buflen & (net_device->pkt_align - 1);
 	if (skb->xmit_more && remain && !packet->cp_partial) {
 		padding = net_device->pkt_align - remain;
 		rndis_msg->msg_len += padding;
@@ -848,7 +847,6 @@ int netvsc_send(struct net_device_context *ndev_ctx,
 	struct hv_netvsc_packet *msd_send = NULL, *cur_send = NULL;
 	struct sk_buff *msd_skb = NULL;
 	bool try_batch;
-	bool xmit_more = (skb != NULL) ? skb->xmit_more : false;
 
 	/* If device is rescinded, return error and packet will get dropped. */
 	if (unlikely(!net_device || net_device->destroy))
@@ -922,7 +920,7 @@ int netvsc_send(struct net_device_context *ndev_ctx,
 		if (msdp->skb)
 			dev_consume_skb_any(msdp->skb);
 
-		if (xmit_more && !packet->cp_partial) {
+		if (skb->xmit_more && !packet->cp_partial) {
 			msdp->skb = skb;
 			msdp->pkt = packet;
 			msdp->count++;
@@ -1249,7 +1247,6 @@ struct netvsc_device *netvsc_device_add(struct hv_device *device,
 				const struct netvsc_device_info *device_info)
 {
 	int i, ret = 0;
-	int ring_size = device_info->ring_size;
 	struct netvsc_device *net_device;
 	struct net_device *ndev = hv_get_drvdata(device);
 	struct net_device_context *net_device_ctx = netdev_priv(ndev);
@@ -1261,8 +1258,6 @@ struct netvsc_device *netvsc_device_add(struct hv_device *device,
 	for (i = 0; i < VRSS_SEND_TAB_SIZE; i++)
 		net_device_ctx->tx_table[i] = 0;
 
-	net_device->ring_size = ring_size;
-
 	/* Because the device uses NAPI, all the interrupt batching and
 	 * control is done via Net softirq, not the channel handling
 	 */
@@ -1289,10 +1284,9 @@ struct netvsc_device *netvsc_device_add(struct hv_device *device,
 		       netvsc_poll, NAPI_POLL_WEIGHT);
 
 	/* Open the channel */
-	ret = vmbus_open(device->channel, ring_size * PAGE_SIZE,
-			 ring_size * PAGE_SIZE, NULL, 0,
-			 netvsc_channel_cb,
-			 net_device->chan_table);
+	ret = vmbus_open(device->channel, netvsc_ring_bytes,
+			 netvsc_ring_bytes,  NULL, 0,
+			 netvsc_channel_cb, net_device->chan_table);
 
 	if (ret != 0) {
 		netif_napi_del(&net_device->chan_table[0].napi);
diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index 5129647d420c..dc70de674ca9 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -35,6 +35,7 @@
 #include <linux/slab.h>
 #include <linux/rtnetlink.h>
 #include <linux/netpoll.h>
+#include <linux/reciprocal_div.h>
 
 #include <net/arp.h>
 #include <net/route.h>
@@ -54,9 +55,11 @@
 #define LINKCHANGE_INT (2 * HZ)
 #define VF_TAKEOVER_INT (HZ / 10)
 
-static int ring_size = 128;
-module_param(ring_size, int, S_IRUGO);
+static unsigned int ring_size __ro_after_init = 128;
+module_param(ring_size, uint, S_IRUGO);
 MODULE_PARM_DESC(ring_size, "Ring buffer size (# of pages)");
+unsigned int netvsc_ring_bytes __ro_after_init;
+struct reciprocal_value netvsc_ring_reciprocal __ro_after_init;
 
 static const u32 default_msg = NETIF_MSG_DRV | NETIF_MSG_PROBE |
 				NETIF_MSG_LINK | NETIF_MSG_IFUP |
@@ -174,17 +177,15 @@ out:
 	return ret;
 }
 
-static void *init_ppi_data(struct rndis_message *msg, u32 ppi_size,
-			   int pkt_type)
+static inline void *init_ppi_data(struct rndis_message *msg,
+				  u32 ppi_size, u32 pkt_type)
 {
-	struct rndis_packet *rndis_pkt;
+	struct rndis_packet *rndis_pkt = &msg->msg.pkt;
 	struct rndis_per_packet_info *ppi;
 
-	rndis_pkt = &msg->msg.pkt;
 	rndis_pkt->data_offset += ppi_size;
-
-	ppi = (struct rndis_per_packet_info *)((void *)rndis_pkt +
-		rndis_pkt->per_pkt_info_offset + rndis_pkt->per_pkt_info_len);
+	ppi = (void *)rndis_pkt + rndis_pkt->per_pkt_info_offset
+		+ rndis_pkt->per_pkt_info_len;
 
 	ppi->size = ppi_size;
 	ppi->type = pkt_type;
@@ -192,7 +193,7 @@ static void *init_ppi_data(struct rndis_message *msg, u32 ppi_size,
 
 	rndis_pkt->per_pkt_info_len += ppi_size;
 
-	return ppi;
+	return ppi + 1;
 }
 
 /* Azure hosts don't support non-TCP port numbers in hashing for fragmented
@@ -469,10 +470,8 @@ static int netvsc_start_xmit(struct sk_buff *skb, struct net_device *net)
 	int ret;
 	unsigned int num_data_pgs;
 	struct rndis_message *rndis_msg;
-	struct rndis_packet *rndis_pkt;
 	struct net_device *vf_netdev;
 	u32 rndis_msg_size;
-	struct rndis_per_packet_info *ppi;
 	u32 hash;
 	struct hv_page_buffer pb[MAX_PAGE_BUFFER_COUNT];
 
@@ -527,34 +526,36 @@ static int netvsc_start_xmit(struct sk_buff *skb, struct net_device *net)
 
 	rndis_msg = (struct rndis_message *)skb->head;
 
-	memset(rndis_msg, 0, RNDIS_AND_PPI_SIZE);
-
 	/* Add the rndis header */
 	rndis_msg->ndis_msg_type = RNDIS_MSG_PACKET;
 	rndis_msg->msg_len = packet->total_data_buflen;
-	rndis_pkt = &rndis_msg->msg.pkt;
-	rndis_pkt->data_offset = sizeof(struct rndis_packet);
-	rndis_pkt->data_len = packet->total_data_buflen;
-	rndis_pkt->per_pkt_info_offset = sizeof(struct rndis_packet);
+
+	rndis_msg->msg.pkt = (struct rndis_packet) {
+		.data_offset = sizeof(struct rndis_packet),
+		.data_len = packet->total_data_buflen,
+		.per_pkt_info_offset = sizeof(struct rndis_packet),
+	};
 
 	rndis_msg_size = RNDIS_MESSAGE_SIZE(struct rndis_packet);
 
 	hash = skb_get_hash_raw(skb);
 	if (hash != 0 && net->real_num_tx_queues > 1) {
+		u32 *hash_info;
+
 		rndis_msg_size += NDIS_HASH_PPI_SIZE;
-		ppi = init_ppi_data(rndis_msg, NDIS_HASH_PPI_SIZE,
-				    NBL_HASH_VALUE);
-		*(u32 *)((void *)ppi + ppi->ppi_offset) = hash;
+		hash_info = init_ppi_data(rndis_msg, NDIS_HASH_PPI_SIZE,
+					  NBL_HASH_VALUE);
+		*hash_info = hash;
 	}
 
 	if (skb_vlan_tag_present(skb)) {
 		struct ndis_pkt_8021q_info *vlan;
 
 		rndis_msg_size += NDIS_VLAN_PPI_SIZE;
-		ppi = init_ppi_data(rndis_msg, NDIS_VLAN_PPI_SIZE,
-				    IEEE_8021Q_INFO);
+		vlan = init_ppi_data(rndis_msg, NDIS_VLAN_PPI_SIZE,
+				     IEEE_8021Q_INFO);
 
-		vlan = (void *)ppi + ppi->ppi_offset;
+		vlan->value = 0;
 		vlan->vlanid = skb->vlan_tci & VLAN_VID_MASK;
 		vlan->pri = (skb->vlan_tci & VLAN_PRIO_MASK) >>
 				VLAN_PRIO_SHIFT;
@@ -564,11 +565,10 @@ static int netvsc_start_xmit(struct sk_buff *skb, struct net_device *net)
 		struct ndis_tcp_lso_info *lso_info;
 
 		rndis_msg_size += NDIS_LSO_PPI_SIZE;
-		ppi = init_ppi_data(rndis_msg, NDIS_LSO_PPI_SIZE,
-				    TCP_LARGESEND_PKTINFO);
-
-		lso_info = (void *)ppi + ppi->ppi_offset;
+		lso_info = init_ppi_data(rndis_msg, NDIS_LSO_PPI_SIZE,
+					 TCP_LARGESEND_PKTINFO);
 
+		lso_info->value = 0;
 		lso_info->lso_v2_transmit.type = NDIS_TCP_LARGE_SEND_OFFLOAD_V2_TYPE;
 		if (skb->protocol == htons(ETH_P_IP)) {
 			lso_info->lso_v2_transmit.ip_version =
@@ -593,12 +593,10 @@ static int netvsc_start_xmit(struct sk_buff *skb, struct net_device *net)
 			struct ndis_tcp_ip_checksum_info *csum_info;
 
 			rndis_msg_size += NDIS_CSUM_PPI_SIZE;
-			ppi = init_ppi_data(rndis_msg, NDIS_CSUM_PPI_SIZE,
-					    TCPIP_CHKSUM_PKTINFO);
-
-			csum_info = (struct ndis_tcp_ip_checksum_info *)((void *)ppi +
-									 ppi->ppi_offset);
+			csum_info = init_ppi_data(rndis_msg, NDIS_CSUM_PPI_SIZE,
+						  TCPIP_CHKSUM_PKTINFO);
 
+			csum_info->value = 0;
 			csum_info->transmit.tcp_header_offset = skb_transport_offset(skb);
 
 			if (skb->protocol == htons(ETH_P_IP)) {
@@ -860,7 +858,6 @@ static int netvsc_set_channels(struct net_device *net,
 
 	memset(&device_info, 0, sizeof(device_info));
 	device_info.num_chn = count;
-	device_info.ring_size = ring_size;
 	device_info.send_sections = nvdev->send_section_cnt;
 	device_info.send_section_size = nvdev->send_section_size;
 	device_info.recv_sections = nvdev->recv_section_cnt;
@@ -975,7 +972,6 @@ static int netvsc_change_mtu(struct net_device *ndev, int mtu)
 		rndis_filter_close(nvdev);
 
 	memset(&device_info, 0, sizeof(device_info));
-	device_info.ring_size = ring_size;
 	device_info.num_chn = nvdev->num_chn;
 	device_info.send_sections = nvdev->send_section_cnt;
 	device_info.send_section_size = nvdev->send_section_size;
@@ -1539,7 +1535,6 @@ static int netvsc_set_ringparam(struct net_device *ndev,
 
 	memset(&device_info, 0, sizeof(device_info));
 	device_info.num_chn = nvdev->num_chn;
-	device_info.ring_size = ring_size;
 	device_info.send_sections = new_tx;
 	device_info.send_section_size = nvdev->send_section_size;
 	device_info.recv_sections = new_rx;
@@ -1995,7 +1990,6 @@ static int netvsc_probe(struct hv_device *dev,
 
 	/* Notify the netvsc driver of the new device */
 	memset(&device_info, 0, sizeof(device_info));
-	device_info.ring_size = ring_size;
 	device_info.num_chn = VRSS_CHANNEL_DEFAULT;
 	device_info.send_sections = NETVSC_DEFAULT_TX;
 	device_info.send_section_size = NETVSC_SEND_SECTION_SIZE;
@@ -2158,11 +2152,13 @@ static int __init netvsc_drv_init(void)
 
 	if (ring_size < RING_SIZE_MIN) {
 		ring_size = RING_SIZE_MIN;
-		pr_info("Increased ring_size to %d (min allowed)\n",
+		pr_info("Increased ring_size to %u (min allowed)\n",
 			ring_size);
 	}
-	ret = vmbus_driver_register(&netvsc_drv);
+	netvsc_ring_bytes = ring_size * PAGE_SIZE;
+	netvsc_ring_reciprocal = reciprocal_value(netvsc_ring_bytes);
 
+	ret = vmbus_driver_register(&netvsc_drv);
 	if (ret)
 		return ret;
 
diff --git a/drivers/net/hyperv/rndis_filter.c b/drivers/net/hyperv/rndis_filter.c
index 7b637c7dd1e5..673492063307 100644
--- a/drivers/net/hyperv/rndis_filter.c
+++ b/drivers/net/hyperv/rndis_filter.c
@@ -1040,8 +1040,8 @@ static void netvsc_sc_open(struct vmbus_channel *new_sc)
 	/* Set the channel before opening.*/
 	nvchan->channel = new_sc;
 
-	ret = vmbus_open(new_sc, nvscdev->ring_size * PAGE_SIZE,
-			 nvscdev->ring_size * PAGE_SIZE, NULL, 0,
+	ret = vmbus_open(new_sc, netvsc_ring_bytes,
+			 netvsc_ring_bytes, NULL, 0,
 			 netvsc_channel_cb, nvchan);
 	if (ret == 0)
 		napi_enable(&nvchan->napi);
diff --git a/drivers/net/ieee802154/adf7242.c b/drivers/net/ieee802154/adf7242.c
index 400fdbd3a120..64f1b1e77bc0 100644
--- a/drivers/net/ieee802154/adf7242.c
+++ b/drivers/net/ieee802154/adf7242.c
@@ -1,7 +1,7 @@
 /*
  * Analog Devices ADF7242 Low-Power IEEE 802.15.4 Transceiver
  *
- * Copyright 2009-2015 Analog Devices Inc.
+ * Copyright 2009-2017 Analog Devices Inc.
  *
  * Licensed under the GPL-2 or later.
  *
@@ -344,12 +344,18 @@ static int adf7242_wait_status(struct adf7242_local *lp, unsigned int status,
 	return ret;
 }
 
-static int adf7242_wait_ready(struct adf7242_local *lp, int line)
+static int adf7242_wait_rc_ready(struct adf7242_local *lp, int line)
 {
 	return adf7242_wait_status(lp, STAT_RC_READY | STAT_SPI_READY,
 				   STAT_RC_READY | STAT_SPI_READY, line);
 }
 
+static int adf7242_wait_spi_ready(struct adf7242_local *lp, int line)
+{
+	return adf7242_wait_status(lp, STAT_SPI_READY,
+				   STAT_SPI_READY, line);
+}
+
 static int adf7242_write_fbuf(struct adf7242_local *lp, u8 *data, u8 len)
 {
 	u8 *buf = lp->buf;
@@ -369,7 +375,7 @@ static int adf7242_write_fbuf(struct adf7242_local *lp, u8 *data, u8 len)
 	spi_message_add_tail(&xfer_head, &msg);
 	spi_message_add_tail(&xfer_buf, &msg);
 
-	adf7242_wait_ready(lp, __LINE__);
+	adf7242_wait_spi_ready(lp, __LINE__);
 
 	mutex_lock(&lp->bmux);
 	buf[0] = CMD_SPI_PKT_WR;
@@ -401,7 +407,7 @@ static int adf7242_read_fbuf(struct adf7242_local *lp,
 	spi_message_add_tail(&xfer_head, &msg);
 	spi_message_add_tail(&xfer_buf, &msg);
 
-	adf7242_wait_ready(lp, __LINE__);
+	adf7242_wait_spi_ready(lp, __LINE__);
 
 	mutex_lock(&lp->bmux);
 	if (packet_read) {
@@ -432,7 +438,7 @@ static int adf7242_read_reg(struct adf7242_local *lp, u16 addr, u8 *data)
 		.rx_buf = lp->buf_read_rx,
 	};
 
-	adf7242_wait_ready(lp, __LINE__);
+	adf7242_wait_spi_ready(lp, __LINE__);
 
 	mutex_lock(&lp->bmux);
 	lp->buf_read_tx[0] = CMD_SPI_MEM_RD(addr);
@@ -462,7 +468,7 @@ static int adf7242_write_reg(struct adf7242_local *lp, u16 addr, u8 data)
 {
 	int status;
 
-	adf7242_wait_ready(lp, __LINE__);
+	adf7242_wait_spi_ready(lp, __LINE__);
 
 	mutex_lock(&lp->bmux);
 	lp->buf_reg_tx[0] = CMD_SPI_MEM_WR(addr);
@@ -484,7 +490,7 @@ static int adf7242_cmd(struct adf7242_local *lp, unsigned int cmd)
 	dev_vdbg(&lp->spi->dev, "%s : CMD=0x%X\n", __func__, cmd);
 
 	if (cmd != CMD_RC_PC_RESET_NO_WAIT)
-		adf7242_wait_ready(lp, __LINE__);
+		adf7242_wait_rc_ready(lp, __LINE__);
 
 	mutex_lock(&lp->bmux);
 	lp->buf_cmd = cmd;
@@ -557,6 +563,22 @@ static int adf7242_verify_firmware(struct adf7242_local *lp,
 	return 0;
 }
 
+static void adf7242_clear_irqstat(struct adf7242_local *lp)
+{
+	adf7242_write_reg(lp, REG_IRQ1_SRC1, IRQ_CCA_COMPLETE | IRQ_SFD_RX |
+			  IRQ_SFD_TX | IRQ_RX_PKT_RCVD | IRQ_TX_PKT_SENT |
+			  IRQ_FRAME_VALID | IRQ_ADDRESS_VALID | IRQ_CSMA_CA);
+}
+
+static int adf7242_cmd_rx(struct adf7242_local *lp)
+{
+	/* Wait until the ACK is sent */
+	adf7242_wait_status(lp, RC_STATUS_PHY_RDY, RC_STATUS_MASK, __LINE__);
+	adf7242_clear_irqstat(lp);
+
+	return adf7242_cmd(lp, CMD_RC_RX);
+}
+
 static int adf7242_set_txpower(struct ieee802154_hw *hw, int mbm)
 {
 	struct adf7242_local *lp = hw->priv;
@@ -660,7 +682,7 @@ static int adf7242_start(struct ieee802154_hw *hw)
 	struct adf7242_local *lp = hw->priv;
 
 	adf7242_cmd(lp, CMD_RC_PHY_RDY);
-	adf7242_write_reg(lp, REG_IRQ1_SRC1, 0xFF);
+	adf7242_clear_irqstat(lp);
 	enable_irq(lp->spi->irq);
 	set_bit(FLAG_START, &lp->flags);
 
@@ -671,10 +693,10 @@ static void adf7242_stop(struct ieee802154_hw *hw)
 {
 	struct adf7242_local *lp = hw->priv;
 
+	disable_irq(lp->spi->irq);
 	adf7242_cmd(lp, CMD_RC_IDLE);
 	clear_bit(FLAG_START, &lp->flags);
-	disable_irq(lp->spi->irq);
-	adf7242_write_reg(lp, REG_IRQ1_SRC1, 0xFF);
+	adf7242_clear_irqstat(lp);
 }
 
 static int adf7242_channel(struct ieee802154_hw *hw, u8 page, u8 channel)
@@ -789,9 +811,12 @@ static int adf7242_xmit(struct ieee802154_hw *hw, struct sk_buff *skb)
 	struct adf7242_local *lp = hw->priv;
 	int ret;
 
+	/* ensure existing instances of the IRQ handler have completed */
+	disable_irq(lp->spi->irq);
 	set_bit(FLAG_XMIT, &lp->flags);
 	reinit_completion(&lp->tx_complete);
 	adf7242_cmd(lp, CMD_RC_PHY_RDY);
+	adf7242_clear_irqstat(lp);
 
 	ret = adf7242_write_fbuf(lp, skb->data, skb->len);
 	if (ret)
@@ -800,6 +825,7 @@ static int adf7242_xmit(struct ieee802154_hw *hw, struct sk_buff *skb)
 	ret = adf7242_cmd(lp, CMD_RC_CSMACA);
 	if (ret)
 		goto err;
+	enable_irq(lp->spi->irq);
 
 	ret = wait_for_completion_interruptible_timeout(&lp->tx_complete,
 							HZ / 10);
@@ -822,7 +848,7 @@ static int adf7242_xmit(struct ieee802154_hw *hw, struct sk_buff *skb)
 
 err:
 	clear_bit(FLAG_XMIT, &lp->flags);
-	adf7242_cmd(lp, CMD_RC_RX);
+	adf7242_cmd_rx(lp);
 
 	return ret;
 }
@@ -846,7 +872,7 @@ static int adf7242_rx(struct adf7242_local *lp)
 
 	skb = dev_alloc_skb(len);
 	if (!skb) {
-		adf7242_cmd(lp, CMD_RC_RX);
+		adf7242_cmd_rx(lp);
 		return -ENOMEM;
 	}
 
@@ -854,14 +880,14 @@ static int adf7242_rx(struct adf7242_local *lp)
 	ret = adf7242_read_fbuf(lp, data, len, true);
 	if (ret < 0) {
 		kfree_skb(skb);
-		adf7242_cmd(lp, CMD_RC_RX);
+		adf7242_cmd_rx(lp);
 		return ret;
 	}
 
 	lqi = data[len - 2];
 	lp->rssi = data[len - 1];
 
-	adf7242_cmd(lp, CMD_RC_RX);
+	ret = adf7242_cmd_rx(lp);
 
 	skb_trim(skb, len - 2);	/* Don't put RSSI/LQI or CRC into the frame */
 
@@ -870,7 +896,7 @@ static int adf7242_rx(struct adf7242_local *lp)
 	dev_dbg(&lp->spi->dev, "%s: ret=%d len=%d lqi=%d rssi=%d\n",
 		__func__, ret, (int)len, (int)lqi, lp->rssi);
 
-	return 0;
+	return ret;
 }
 
 static const struct ieee802154_ops adf7242_ops = {
@@ -888,7 +914,7 @@ static const struct ieee802154_ops adf7242_ops = {
 	.set_cca_ed_level = adf7242_set_cca_ed_level,
 };
 
-static void adf7242_debug(u8 irq1)
+static void adf7242_debug(struct adf7242_local *lp, u8 irq1)
 {
 #ifdef DEBUG
 	u8 stat;
@@ -906,9 +932,12 @@ static void adf7242_debug(u8 irq1)
 		irq1 & IRQ_FRAME_VALID ? "IRQ_FRAME_VALID\n" : "",
 		irq1 & IRQ_ADDRESS_VALID ? "IRQ_ADDRESS_VALID\n" : "");
 
-	dev_dbg(&lp->spi->dev, "%s STATUS = %X:\n%s\n%s%s%s%s%s\n",
+	dev_dbg(&lp->spi->dev, "%s STATUS = %X:\n%s\n%s\n%s\n%s\n%s%s%s%s%s\n",
 		__func__, stat,
+		stat & STAT_SPI_READY ? "SPI_READY" : "SPI_BUSY",
+		stat & STAT_IRQ_STATUS ? "IRQ_PENDING" : "IRQ_CLEAR",
 		stat & STAT_RC_READY ? "RC_READY" : "RC_BUSY",
+		stat & STAT_CCA_RESULT ? "CHAN_IDLE" : "CHAN_BUSY",
 		(stat & 0xf) == RC_STATUS_IDLE ? "RC_STATUS_IDLE" : "",
 		(stat & 0xf) == RC_STATUS_MEAS ? "RC_STATUS_MEAS" : "",
 		(stat & 0xf) == RC_STATUS_PHY_RDY ? "RC_STATUS_PHY_RDY" : "",
@@ -923,20 +952,20 @@ static irqreturn_t adf7242_isr(int irq, void *data)
 	unsigned int xmit;
 	u8 irq1;
 
-	adf7242_wait_status(lp, RC_STATUS_PHY_RDY, RC_STATUS_MASK, __LINE__);
-
 	adf7242_read_reg(lp, REG_IRQ1_SRC1, &irq1);
-	adf7242_write_reg(lp, REG_IRQ1_SRC1, irq1);
 
 	if (!(irq1 & (IRQ_RX_PKT_RCVD | IRQ_CSMA_CA)))
 		dev_err(&lp->spi->dev, "%s :ERROR IRQ1 = 0x%X\n",
 			__func__, irq1);
 
-	adf7242_debug(irq1);
+	adf7242_debug(lp, irq1);
 
 	xmit = test_bit(FLAG_XMIT, &lp->flags);
 
 	if (xmit && (irq1 & IRQ_CSMA_CA)) {
+		adf7242_wait_status(lp, RC_STATUS_PHY_RDY,
+				    RC_STATUS_MASK, __LINE__);
+
 		if (ADF7242_REPORT_CSMA_CA_STAT) {
 			u8 astat;
 
@@ -957,6 +986,7 @@ static irqreturn_t adf7242_isr(int irq, void *data)
 			lp->tx_stat = SUCCESS;
 		}
 		complete(&lp->tx_complete);
+		adf7242_clear_irqstat(lp);
 	} else if (!xmit && (irq1 & IRQ_RX_PKT_RCVD) &&
 		   (irq1 & IRQ_FRAME_VALID)) {
 		adf7242_rx(lp);
@@ -965,16 +995,19 @@ static irqreturn_t adf7242_isr(int irq, void *data)
 		dev_dbg(&lp->spi->dev, "%s:%d : ERROR IRQ1 = 0x%X\n",
 			__func__, __LINE__, irq1);
 		adf7242_cmd(lp, CMD_RC_PHY_RDY);
-		adf7242_write_reg(lp, REG_IRQ1_SRC1, 0xFF);
-		adf7242_cmd(lp, CMD_RC_RX);
+		adf7242_cmd_rx(lp);
 	} else {
 		/* This can only be xmit without IRQ, likely a RX packet.
 		 * we get an TX IRQ shortly - do nothing or let the xmit
 		 * timeout handle this
 		 */
+
 		dev_dbg(&lp->spi->dev, "%s:%d : ERROR IRQ1 = 0x%X, xmit %d\n",
 			__func__, __LINE__, irq1, xmit);
+		adf7242_wait_status(lp, RC_STATUS_PHY_RDY,
+				    RC_STATUS_MASK, __LINE__);
 		complete(&lp->tx_complete);
+		adf7242_clear_irqstat(lp);
 	}
 
 	return IRQ_HANDLED;
@@ -994,7 +1027,7 @@ static int adf7242_soft_reset(struct adf7242_local *lp, int line)
 	adf7242_set_promiscuous_mode(lp->hw, lp->promiscuous);
 	adf7242_set_csma_params(lp->hw, lp->min_be, lp->max_be,
 				lp->max_cca_retries);
-	adf7242_write_reg(lp, REG_IRQ1_SRC1, 0xFF);
+	adf7242_clear_irqstat(lp);
 
 	if (test_bit(FLAG_START, &lp->flags)) {
 		enable_irq(lp->spi->irq);
@@ -1060,7 +1093,7 @@ static int adf7242_hw_init(struct adf7242_local *lp)
 	adf7242_write_reg(lp, REG_IRQ1_EN0, 0);
 	adf7242_write_reg(lp, REG_IRQ1_EN1, IRQ_RX_PKT_RCVD | IRQ_CSMA_CA);
 
-	adf7242_write_reg(lp, REG_IRQ1_SRC1, 0xFF);
+	adf7242_clear_irqstat(lp);
 	adf7242_write_reg(lp, REG_IRQ1_SRC0, 0xFF);
 
 	adf7242_cmd(lp, CMD_RC_IDLE);
@@ -1086,8 +1119,11 @@ static int adf7242_stats_show(struct seq_file *file, void *offset)
 		   irq1 & IRQ_FRAME_VALID ? "IRQ_FRAME_VALID\n" : "",
 		   irq1 & IRQ_ADDRESS_VALID ? "IRQ_ADDRESS_VALID\n" : "");
 
-	seq_printf(file, "STATUS = %X:\n%s\n%s%s%s%s%s\n", stat,
+	seq_printf(file, "STATUS = %X:\n%s\n%s\n%s\n%s\n%s%s%s%s%s\n", stat,
+		   stat & STAT_SPI_READY ? "SPI_READY" : "SPI_BUSY",
+		   stat & STAT_IRQ_STATUS ? "IRQ_PENDING" : "IRQ_CLEAR",
 		   stat & STAT_RC_READY ? "RC_READY" : "RC_BUSY",
+		   stat & STAT_CCA_RESULT ? "CHAN_IDLE" : "CHAN_BUSY",
 		   (stat & 0xf) == RC_STATUS_IDLE ? "RC_STATUS_IDLE" : "",
 		   (stat & 0xf) == RC_STATUS_MEAS ? "RC_STATUS_MEAS" : "",
 		   (stat & 0xf) == RC_STATUS_PHY_RDY ? "RC_STATUS_PHY_RDY" : "",
@@ -1257,12 +1293,14 @@ static int adf7242_remove(struct spi_device *spi)
 
 static const struct of_device_id adf7242_of_match[] = {
 	{ .compatible = "adi,adf7242", },
+	{ .compatible = "adi,adf7241", },
 	{ },
 };
 MODULE_DEVICE_TABLE(of, adf7242_of_match);
 
 static const struct spi_device_id adf7242_device_id[] = {
 	{ .name = "adf7242", },
+	{ .name = "adf7241", },
 	{ },
 };
 MODULE_DEVICE_TABLE(spi, adf7242_device_id);
diff --git a/drivers/net/ipvlan/ipvlan_core.c b/drivers/net/ipvlan/ipvlan_core.c
index 77cc4fbaeace..9774c96ac7bb 100644
--- a/drivers/net/ipvlan/ipvlan_core.c
+++ b/drivers/net/ipvlan/ipvlan_core.c
@@ -664,8 +664,6 @@ static rx_handler_result_t ipvlan_handle_mode_l2(struct sk_buff **pskb,
 	struct sk_buff *skb = *pskb;
 	struct ethhdr *eth = eth_hdr(skb);
 	rx_handler_result_t ret = RX_HANDLER_PASS;
-	void *lyr3h;
-	int addr_type;
 
 	if (is_multicast_ether_addr(eth->h_dest)) {
 		if (ipvlan_external_frame(skb, port)) {
@@ -683,15 +681,8 @@ static rx_handler_result_t ipvlan_handle_mode_l2(struct sk_buff **pskb,
 			}
 		}
 	} else {
-		struct ipvl_addr *addr;
-
-		lyr3h = ipvlan_get_L3_hdr(port, skb, &addr_type);
-		if (!lyr3h)
-			return ret;
-
-		addr = ipvlan_addr_lookup(port, lyr3h, addr_type, true);
-		if (addr)
-			ret = ipvlan_rcv_frame(addr, pskb, false);
+		/* Perform like l3 mode for non-multicast packet */
+		ret = ipvlan_handle_mode_l3(pskb, port);
 	}
 
 	return ret;
diff --git a/drivers/net/ipvlan/ipvlan_main.c b/drivers/net/ipvlan/ipvlan_main.c
index 30cb803e2fe5..2469df118fbf 100644
--- a/drivers/net/ipvlan/ipvlan_main.c
+++ b/drivers/net/ipvlan/ipvlan_main.c
@@ -850,6 +850,19 @@ static void ipvlan_del_addr6(struct ipvl_dev *ipvlan, struct in6_addr *ip6_addr)
 	return ipvlan_del_addr(ipvlan, ip6_addr, true);
 }
 
+static bool ipvlan_is_valid_dev(const struct net_device *dev)
+{
+	struct ipvl_dev *ipvlan = netdev_priv(dev);
+
+	if (!netif_is_ipvlan(dev))
+		return false;
+
+	if (!ipvlan || !ipvlan->port)
+		return false;
+
+	return true;
+}
+
 static int ipvlan_addr6_event(struct notifier_block *unused,
 			      unsigned long event, void *ptr)
 {
@@ -857,10 +870,7 @@ static int ipvlan_addr6_event(struct notifier_block *unused,
 	struct net_device *dev = (struct net_device *)if6->idev->dev;
 	struct ipvl_dev *ipvlan = netdev_priv(dev);
 
-	if (!netif_is_ipvlan(dev))
-		return NOTIFY_DONE;
-
-	if (!ipvlan || !ipvlan->port)
+	if (!ipvlan_is_valid_dev(dev))
 		return NOTIFY_DONE;
 
 	switch (event) {
@@ -888,10 +898,7 @@ static int ipvlan_addr6_validator_event(struct notifier_block *unused,
 	if (in_softirq())
 		return NOTIFY_DONE;
 
-	if (!netif_is_ipvlan(dev))
-		return NOTIFY_DONE;
-
-	if (!ipvlan || !ipvlan->port)
+	if (!ipvlan_is_valid_dev(dev))
 		return NOTIFY_DONE;
 
 	switch (event) {
@@ -932,10 +939,7 @@ static int ipvlan_addr4_event(struct notifier_block *unused,
 	struct ipvl_dev *ipvlan = netdev_priv(dev);
 	struct in_addr ip4_addr;
 
-	if (!netif_is_ipvlan(dev))
-		return NOTIFY_DONE;
-
-	if (!ipvlan || !ipvlan->port)
+	if (!ipvlan_is_valid_dev(dev))
 		return NOTIFY_DONE;
 
 	switch (event) {
@@ -961,10 +965,7 @@ static int ipvlan_addr4_validator_event(struct notifier_block *unused,
 	struct net_device *dev = (struct net_device *)ivi->ivi_dev->dev;
 	struct ipvl_dev *ipvlan = netdev_priv(dev);
 
-	if (!netif_is_ipvlan(dev))
-		return NOTIFY_DONE;
-
-	if (!ipvlan || !ipvlan->port)
+	if (!ipvlan_is_valid_dev(dev))
 		return NOTIFY_DONE;
 
 	switch (event) {
diff --git a/drivers/net/netdevsim/Makefile b/drivers/net/netdevsim/Makefile
new file mode 100644
index 000000000000..074ddebbc41d
--- /dev/null
+++ b/drivers/net/netdevsim/Makefile
@@ -0,0 +1,7 @@
+# SPDX-License-Identifier: GPL-2.0
+
+obj-$(CONFIG_NETDEVSIM) += netdevsim.o
+
+netdevsim-objs := \
+	netdev.o \
+	bpf.o \
diff --git a/drivers/net/netdevsim/bpf.c b/drivers/net/netdevsim/bpf.c
new file mode 100644
index 000000000000..078d2c37a6c1
--- /dev/null
+++ b/drivers/net/netdevsim/bpf.c
@@ -0,0 +1,373 @@
+/*
+ * Copyright (C) 2017 Netronome Systems, Inc.
+ *
+ * This software is licensed under the GNU General License Version 2,
+ * June 1991 as shown in the file COPYING in the top-level directory of this
+ * source tree.
+ *
+ * THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS"
+ * WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING,
+ * BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE
+ * OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME
+ * THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
+ */
+
+#include <linux/bpf.h>
+#include <linux/bpf_verifier.h>
+#include <linux/debugfs.h>
+#include <linux/kernel.h>
+#include <linux/rtnetlink.h>
+#include <net/pkt_cls.h>
+
+#include "netdevsim.h"
+
+struct nsim_bpf_bound_prog {
+	struct netdevsim *ns;
+	struct bpf_prog *prog;
+	struct dentry *ddir;
+	const char *state;
+	bool is_loaded;
+	struct list_head l;
+};
+
+static int nsim_debugfs_bpf_string_read(struct seq_file *file, void *data)
+{
+	const char **str = file->private;
+
+	if (*str)
+		seq_printf(file, "%s\n", *str);
+
+	return 0;
+}
+
+static int nsim_debugfs_bpf_string_open(struct inode *inode, struct file *f)
+{
+	return single_open(f, nsim_debugfs_bpf_string_read, inode->i_private);
+}
+
+static const struct file_operations nsim_bpf_string_fops = {
+	.owner = THIS_MODULE,
+	.open = nsim_debugfs_bpf_string_open,
+	.release = single_release,
+	.read = seq_read,
+	.llseek = seq_lseek
+};
+
+static int
+nsim_bpf_verify_insn(struct bpf_verifier_env *env, int insn_idx, int prev_insn)
+{
+	struct nsim_bpf_bound_prog *state;
+
+	state = env->prog->aux->offload->dev_priv;
+	if (state->ns->bpf_bind_verifier_delay && !insn_idx)
+		msleep(state->ns->bpf_bind_verifier_delay);
+
+	return 0;
+}
+
+static const struct bpf_ext_analyzer_ops nsim_bpf_analyzer_ops = {
+	.insn_hook = nsim_bpf_verify_insn,
+};
+
+static bool nsim_xdp_offload_active(struct netdevsim *ns)
+{
+	return ns->xdp_prog_mode == XDP_ATTACHED_HW;
+}
+
+static void nsim_prog_set_loaded(struct bpf_prog *prog, bool loaded)
+{
+	struct nsim_bpf_bound_prog *state;
+
+	if (!prog || !prog->aux->offload)
+		return;
+
+	state = prog->aux->offload->dev_priv;
+	state->is_loaded = loaded;
+}
+
+static int
+nsim_bpf_offload(struct netdevsim *ns, struct bpf_prog *prog, bool oldprog)
+{
+	nsim_prog_set_loaded(ns->bpf_offloaded, false);
+
+	WARN(!!ns->bpf_offloaded != oldprog,
+	     "bad offload state, expected offload %sto be active",
+	     oldprog ? "" : "not ");
+	ns->bpf_offloaded = prog;
+	ns->bpf_offloaded_id = prog ? prog->aux->id : 0;
+	nsim_prog_set_loaded(prog, true);
+
+	return 0;
+}
+
+int nsim_bpf_setup_tc_block_cb(enum tc_setup_type type,
+			       void *type_data, void *cb_priv)
+{
+	struct tc_cls_bpf_offload *cls_bpf = type_data;
+	struct bpf_prog *prog = cls_bpf->prog;
+	struct netdevsim *ns = cb_priv;
+	bool skip_sw;
+
+	if (type != TC_SETUP_CLSBPF ||
+	    !tc_can_offload(ns->netdev) ||
+	    cls_bpf->common.protocol != htons(ETH_P_ALL) ||
+	    cls_bpf->common.chain_index)
+		return -EOPNOTSUPP;
+
+	skip_sw = cls_bpf->gen_flags & TCA_CLS_FLAGS_SKIP_SW;
+
+	if (nsim_xdp_offload_active(ns))
+		return -EBUSY;
+
+	if (!ns->bpf_tc_accept)
+		return -EOPNOTSUPP;
+	/* Note: progs without skip_sw will probably not be dev bound */
+	if (prog && !prog->aux->offload && !ns->bpf_tc_non_bound_accept)
+		return -EOPNOTSUPP;
+
+	switch (cls_bpf->command) {
+	case TC_CLSBPF_REPLACE:
+		return nsim_bpf_offload(ns, prog, true);
+	case TC_CLSBPF_ADD:
+		return nsim_bpf_offload(ns, prog, false);
+	case TC_CLSBPF_DESTROY:
+		return nsim_bpf_offload(ns, NULL, true);
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+int nsim_bpf_disable_tc(struct netdevsim *ns)
+{
+	if (ns->bpf_offloaded && !nsim_xdp_offload_active(ns))
+		return -EBUSY;
+	return 0;
+}
+
+static int nsim_xdp_offload_prog(struct netdevsim *ns, struct netdev_bpf *bpf)
+{
+	if (!nsim_xdp_offload_active(ns) && !bpf->prog)
+		return 0;
+	if (!nsim_xdp_offload_active(ns) && bpf->prog && ns->bpf_offloaded) {
+		NSIM_EA(bpf->extack, "TC program is already loaded");
+		return -EBUSY;
+	}
+
+	return nsim_bpf_offload(ns, bpf->prog, nsim_xdp_offload_active(ns));
+}
+
+static int nsim_xdp_set_prog(struct netdevsim *ns, struct netdev_bpf *bpf)
+{
+	int err;
+
+	if (ns->xdp_prog && (bpf->flags ^ ns->xdp_flags) & XDP_FLAGS_MODES) {
+		NSIM_EA(bpf->extack, "program loaded with different flags");
+		return -EBUSY;
+	}
+
+	if (bpf->command == XDP_SETUP_PROG && !ns->bpf_xdpdrv_accept) {
+		NSIM_EA(bpf->extack, "driver XDP disabled in DebugFS");
+		return -EOPNOTSUPP;
+	}
+	if (bpf->command == XDP_SETUP_PROG_HW && !ns->bpf_xdpoffload_accept) {
+		NSIM_EA(bpf->extack, "XDP offload disabled in DebugFS");
+		return -EOPNOTSUPP;
+	}
+
+	if (bpf->command == XDP_SETUP_PROG_HW) {
+		err = nsim_xdp_offload_prog(ns, bpf);
+		if (err)
+			return err;
+	}
+
+	if (ns->xdp_prog)
+		bpf_prog_put(ns->xdp_prog);
+
+	ns->xdp_prog = bpf->prog;
+	ns->xdp_flags = bpf->flags;
+
+	if (!bpf->prog)
+		ns->xdp_prog_mode = XDP_ATTACHED_NONE;
+	else if (bpf->command == XDP_SETUP_PROG)
+		ns->xdp_prog_mode = XDP_ATTACHED_DRV;
+	else
+		ns->xdp_prog_mode = XDP_ATTACHED_HW;
+
+	return 0;
+}
+
+static int nsim_bpf_create_prog(struct netdevsim *ns, struct bpf_prog *prog)
+{
+	struct nsim_bpf_bound_prog *state;
+	char name[16];
+	int err;
+
+	state = kzalloc(sizeof(*state), GFP_KERNEL);
+	if (!state)
+		return -ENOMEM;
+
+	state->ns = ns;
+	state->prog = prog;
+	state->state = "verify";
+
+	/* Program id is not populated yet when we create the state. */
+	sprintf(name, "%u", ns->prog_id_gen++);
+	state->ddir = debugfs_create_dir(name, ns->ddir_bpf_bound_progs);
+	if (IS_ERR(state->ddir)) {
+		err = PTR_ERR(state->ddir);
+		kfree(state);
+		return err;
+	}
+
+	debugfs_create_u32("id", 0400, state->ddir, &prog->aux->id);
+	debugfs_create_file("state", 0400, state->ddir,
+			    &state->state, &nsim_bpf_string_fops);
+	debugfs_create_bool("loaded", 0400, state->ddir, &state->is_loaded);
+
+	list_add_tail(&state->l, &ns->bpf_bound_progs);
+
+	prog->aux->offload->dev_priv = state;
+
+	return 0;
+}
+
+static void nsim_bpf_destroy_prog(struct bpf_prog *prog)
+{
+	struct nsim_bpf_bound_prog *state;
+
+	state = prog->aux->offload->dev_priv;
+	WARN(state->is_loaded,
+	     "offload state destroyed while program still bound");
+	debugfs_remove_recursive(state->ddir);
+	list_del(&state->l);
+	kfree(state);
+}
+
+static int nsim_setup_prog_checks(struct netdevsim *ns, struct netdev_bpf *bpf)
+{
+	if (bpf->prog && bpf->prog->aux->offload) {
+		NSIM_EA(bpf->extack, "attempt to load offloaded prog to drv");
+		return -EINVAL;
+	}
+	if (ns->netdev->mtu > NSIM_XDP_MAX_MTU) {
+		NSIM_EA(bpf->extack, "MTU too large w/ XDP enabled");
+		return -EINVAL;
+	}
+	if (nsim_xdp_offload_active(ns)) {
+		NSIM_EA(bpf->extack, "xdp offload active, can't load drv prog");
+		return -EBUSY;
+	}
+	return 0;
+}
+
+static int
+nsim_setup_prog_hw_checks(struct netdevsim *ns, struct netdev_bpf *bpf)
+{
+	struct nsim_bpf_bound_prog *state;
+
+	if (!bpf->prog)
+		return 0;
+
+	if (!bpf->prog->aux->offload) {
+		NSIM_EA(bpf->extack, "xdpoffload of non-bound program");
+		return -EINVAL;
+	}
+	if (bpf->prog->aux->offload->netdev != ns->netdev) {
+		NSIM_EA(bpf->extack, "program bound to different dev");
+		return -EINVAL;
+	}
+
+	state = bpf->prog->aux->offload->dev_priv;
+	if (WARN_ON(strcmp(state->state, "xlated"))) {
+		NSIM_EA(bpf->extack, "offloading program in bad state");
+		return -EINVAL;
+	}
+	return 0;
+}
+
+int nsim_bpf(struct net_device *dev, struct netdev_bpf *bpf)
+{
+	struct netdevsim *ns = netdev_priv(dev);
+	struct nsim_bpf_bound_prog *state;
+	int err;
+
+	ASSERT_RTNL();
+
+	switch (bpf->command) {
+	case BPF_OFFLOAD_VERIFIER_PREP:
+		if (!ns->bpf_bind_accept)
+			return -EOPNOTSUPP;
+
+		err = nsim_bpf_create_prog(ns, bpf->verifier.prog);
+		if (err)
+			return err;
+
+		bpf->verifier.ops = &nsim_bpf_analyzer_ops;
+		return 0;
+	case BPF_OFFLOAD_TRANSLATE:
+		state = bpf->offload.prog->aux->offload->dev_priv;
+
+		state->state = "xlated";
+		return 0;
+	case BPF_OFFLOAD_DESTROY:
+		nsim_bpf_destroy_prog(bpf->offload.prog);
+		return 0;
+	case XDP_QUERY_PROG:
+		bpf->prog_attached = ns->xdp_prog_mode;
+		bpf->prog_id = ns->xdp_prog ? ns->xdp_prog->aux->id : 0;
+		bpf->prog_flags = ns->xdp_prog ? ns->xdp_flags : 0;
+		return 0;
+	case XDP_SETUP_PROG:
+		err = nsim_setup_prog_checks(ns, bpf);
+		if (err)
+			return err;
+
+		return nsim_xdp_set_prog(ns, bpf);
+	case XDP_SETUP_PROG_HW:
+		err = nsim_setup_prog_hw_checks(ns, bpf);
+		if (err)
+			return err;
+
+		return nsim_xdp_set_prog(ns, bpf);
+	default:
+		return -EINVAL;
+	}
+}
+
+int nsim_bpf_init(struct netdevsim *ns)
+{
+	INIT_LIST_HEAD(&ns->bpf_bound_progs);
+
+	debugfs_create_u32("bpf_offloaded_id", 0400, ns->ddir,
+			   &ns->bpf_offloaded_id);
+
+	ns->bpf_bind_accept = true;
+	debugfs_create_bool("bpf_bind_accept", 0600, ns->ddir,
+			    &ns->bpf_bind_accept);
+	debugfs_create_u32("bpf_bind_verifier_delay", 0600, ns->ddir,
+			   &ns->bpf_bind_verifier_delay);
+	ns->ddir_bpf_bound_progs =
+		debugfs_create_dir("bpf_bound_progs", ns->ddir);
+
+	ns->bpf_tc_accept = true;
+	debugfs_create_bool("bpf_tc_accept", 0600, ns->ddir,
+			    &ns->bpf_tc_accept);
+	debugfs_create_bool("bpf_tc_non_bound_accept", 0600, ns->ddir,
+			    &ns->bpf_tc_non_bound_accept);
+	ns->bpf_xdpdrv_accept = true;
+	debugfs_create_bool("bpf_xdpdrv_accept", 0600, ns->ddir,
+			    &ns->bpf_xdpdrv_accept);
+	ns->bpf_xdpoffload_accept = true;
+	debugfs_create_bool("bpf_xdpoffload_accept", 0600, ns->ddir,
+			    &ns->bpf_xdpoffload_accept);
+
+	return 0;
+}
+
+void nsim_bpf_uninit(struct netdevsim *ns)
+{
+	WARN_ON(!list_empty(&ns->bpf_bound_progs));
+	WARN_ON(ns->xdp_prog);
+	WARN_ON(ns->bpf_offloaded);
+}
diff --git a/drivers/net/netdevsim/netdev.c b/drivers/net/netdevsim/netdev.c
new file mode 100644
index 000000000000..eb8c679fca9f
--- /dev/null
+++ b/drivers/net/netdevsim/netdev.c
@@ -0,0 +1,502 @@
+/*
+ * Copyright (C) 2017 Netronome Systems, Inc.
+ *
+ * This software is licensed under the GNU General License Version 2,
+ * June 1991 as shown in the file COPYING in the top-level directory of this
+ * source tree.
+ *
+ * THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS"
+ * WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING,
+ * BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE
+ * OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME
+ * THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
+ */
+
+#include <linux/debugfs.h>
+#include <linux/etherdevice.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/netdevice.h>
+#include <linux/slab.h>
+#include <net/netlink.h>
+#include <net/pkt_cls.h>
+#include <net/rtnetlink.h>
+
+#include "netdevsim.h"
+
+struct nsim_vf_config {
+	int link_state;
+	u16 min_tx_rate;
+	u16 max_tx_rate;
+	u16 vlan;
+	__be16 vlan_proto;
+	u16 qos;
+	u8 vf_mac[ETH_ALEN];
+	bool spoofchk_enabled;
+	bool trusted;
+	bool rss_query_enabled;
+};
+
+static u32 nsim_dev_id;
+
+static int nsim_num_vf(struct device *dev)
+{
+	struct netdevsim *ns = to_nsim(dev);
+
+	return ns->num_vfs;
+}
+
+static struct bus_type nsim_bus = {
+	.name		= DRV_NAME,
+	.dev_name	= DRV_NAME,
+	.num_vf		= nsim_num_vf,
+};
+
+static int nsim_vfs_enable(struct netdevsim *ns, unsigned int num_vfs)
+{
+	ns->vfconfigs = kcalloc(num_vfs, sizeof(struct nsim_vf_config),
+				GFP_KERNEL);
+	if (!ns->vfconfigs)
+		return -ENOMEM;
+	ns->num_vfs = num_vfs;
+
+	return 0;
+}
+
+static void nsim_vfs_disable(struct netdevsim *ns)
+{
+	kfree(ns->vfconfigs);
+	ns->vfconfigs = NULL;
+	ns->num_vfs = 0;
+}
+
+static ssize_t
+nsim_numvfs_store(struct device *dev, struct device_attribute *attr,
+		  const char *buf, size_t count)
+{
+	struct netdevsim *ns = to_nsim(dev);
+	unsigned int num_vfs;
+	int ret;
+
+	ret = kstrtouint(buf, 0, &num_vfs);
+	if (ret)
+		return ret;
+
+	rtnl_lock();
+	if (ns->num_vfs == num_vfs)
+		goto exit_good;
+	if (ns->num_vfs && num_vfs) {
+		ret = -EBUSY;
+		goto exit_unlock;
+	}
+
+	if (num_vfs) {
+		ret = nsim_vfs_enable(ns, num_vfs);
+		if (ret)
+			goto exit_unlock;
+	} else {
+		nsim_vfs_disable(ns);
+	}
+exit_good:
+	ret = count;
+exit_unlock:
+	rtnl_unlock();
+
+	return ret;
+}
+
+static ssize_t
+nsim_numvfs_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+	struct netdevsim *ns = to_nsim(dev);
+
+	return sprintf(buf, "%u\n", ns->num_vfs);
+}
+
+static struct device_attribute nsim_numvfs_attr =
+	__ATTR(sriov_numvfs, 0664, nsim_numvfs_show, nsim_numvfs_store);
+
+static struct attribute *nsim_dev_attrs[] = {
+	&nsim_numvfs_attr.attr,
+	NULL,
+};
+
+static const struct attribute_group nsim_dev_attr_group = {
+	.attrs = nsim_dev_attrs,
+};
+
+static const struct attribute_group *nsim_dev_attr_groups[] = {
+	&nsim_dev_attr_group,
+	NULL,
+};
+
+static void nsim_dev_release(struct device *dev)
+{
+	struct netdevsim *ns = to_nsim(dev);
+
+	nsim_vfs_disable(ns);
+	free_netdev(ns->netdev);
+}
+
+struct device_type nsim_dev_type = {
+	.groups = nsim_dev_attr_groups,
+	.release = nsim_dev_release,
+};
+
+static int nsim_init(struct net_device *dev)
+{
+	struct netdevsim *ns = netdev_priv(dev);
+	int err;
+
+	ns->netdev = dev;
+	ns->ddir = debugfs_create_dir(netdev_name(dev), nsim_ddir);
+
+	err = nsim_bpf_init(ns);
+	if (err)
+		goto err_debugfs_destroy;
+
+	ns->dev.id = nsim_dev_id++;
+	ns->dev.bus = &nsim_bus;
+	ns->dev.type = &nsim_dev_type;
+	err = device_register(&ns->dev);
+	if (err)
+		goto err_bpf_uninit;
+
+	SET_NETDEV_DEV(dev, &ns->dev);
+
+	return 0;
+
+err_bpf_uninit:
+	nsim_bpf_uninit(ns);
+err_debugfs_destroy:
+	debugfs_remove_recursive(ns->ddir);
+	return err;
+}
+
+static void nsim_uninit(struct net_device *dev)
+{
+	struct netdevsim *ns = netdev_priv(dev);
+
+	debugfs_remove_recursive(ns->ddir);
+	nsim_bpf_uninit(ns);
+}
+
+static void nsim_free(struct net_device *dev)
+{
+	struct netdevsim *ns = netdev_priv(dev);
+
+	device_unregister(&ns->dev);
+	/* netdev and vf state will be freed out of device_release() */
+}
+
+static netdev_tx_t nsim_start_xmit(struct sk_buff *skb, struct net_device *dev)
+{
+	struct netdevsim *ns = netdev_priv(dev);
+
+	u64_stats_update_begin(&ns->syncp);
+	ns->tx_packets++;
+	ns->tx_bytes += skb->len;
+	u64_stats_update_end(&ns->syncp);
+
+	dev_kfree_skb(skb);
+
+	return NETDEV_TX_OK;
+}
+
+static void nsim_set_rx_mode(struct net_device *dev)
+{
+}
+
+static int nsim_change_mtu(struct net_device *dev, int new_mtu)
+{
+	struct netdevsim *ns = netdev_priv(dev);
+
+	if (ns->xdp_prog_mode == XDP_ATTACHED_DRV &&
+	    new_mtu > NSIM_XDP_MAX_MTU)
+		return -EBUSY;
+
+	dev->mtu = new_mtu;
+
+	return 0;
+}
+
+static void
+nsim_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
+{
+	struct netdevsim *ns = netdev_priv(dev);
+	unsigned int start;
+
+	do {
+		start = u64_stats_fetch_begin(&ns->syncp);
+		stats->tx_bytes = ns->tx_bytes;
+		stats->tx_packets = ns->tx_packets;
+	} while (u64_stats_fetch_retry(&ns->syncp, start));
+}
+
+static int
+nsim_setup_tc_block_cb(enum tc_setup_type type, void *type_data, void *cb_priv)
+{
+	return nsim_bpf_setup_tc_block_cb(type, type_data, cb_priv);
+}
+
+static int
+nsim_setup_tc_block(struct net_device *dev, struct tc_block_offload *f)
+{
+	struct netdevsim *ns = netdev_priv(dev);
+
+	if (f->binder_type != TCF_BLOCK_BINDER_TYPE_CLSACT_INGRESS)
+		return -EOPNOTSUPP;
+
+	switch (f->command) {
+	case TC_BLOCK_BIND:
+		return tcf_block_cb_register(f->block, nsim_setup_tc_block_cb,
+					     ns, ns);
+	case TC_BLOCK_UNBIND:
+		tcf_block_cb_unregister(f->block, nsim_setup_tc_block_cb, ns);
+		return 0;
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+static int nsim_set_vf_mac(struct net_device *dev, int vf, u8 *mac)
+{
+	struct netdevsim *ns = netdev_priv(dev);
+
+	/* Only refuse multicast addresses, zero address can mean unset/any. */
+	if (vf >= ns->num_vfs || is_multicast_ether_addr(mac))
+		return -EINVAL;
+	memcpy(ns->vfconfigs[vf].vf_mac, mac, ETH_ALEN);
+
+	return 0;
+}
+
+static int nsim_set_vf_vlan(struct net_device *dev, int vf,
+			    u16 vlan, u8 qos, __be16 vlan_proto)
+{
+	struct netdevsim *ns = netdev_priv(dev);
+
+	if (vf >= ns->num_vfs || vlan > 4095 || qos > 7)
+		return -EINVAL;
+
+	ns->vfconfigs[vf].vlan = vlan;
+	ns->vfconfigs[vf].qos = qos;
+	ns->vfconfigs[vf].vlan_proto = vlan_proto;
+
+	return 0;
+}
+
+static int nsim_set_vf_rate(struct net_device *dev, int vf, int min, int max)
+{
+	struct netdevsim *ns = netdev_priv(dev);
+
+	if (vf >= ns->num_vfs)
+		return -EINVAL;
+
+	ns->vfconfigs[vf].min_tx_rate = min;
+	ns->vfconfigs[vf].max_tx_rate = max;
+
+	return 0;
+}
+
+static int nsim_set_vf_spoofchk(struct net_device *dev, int vf, bool val)
+{
+	struct netdevsim *ns = netdev_priv(dev);
+
+	if (vf >= ns->num_vfs)
+		return -EINVAL;
+	ns->vfconfigs[vf].spoofchk_enabled = val;
+
+	return 0;
+}
+
+static int nsim_set_vf_rss_query_en(struct net_device *dev, int vf, bool val)
+{
+	struct netdevsim *ns = netdev_priv(dev);
+
+	if (vf >= ns->num_vfs)
+		return -EINVAL;
+	ns->vfconfigs[vf].rss_query_enabled = val;
+
+	return 0;
+}
+
+static int nsim_set_vf_trust(struct net_device *dev, int vf, bool val)
+{
+	struct netdevsim *ns = netdev_priv(dev);
+
+	if (vf >= ns->num_vfs)
+		return -EINVAL;
+	ns->vfconfigs[vf].trusted = val;
+
+	return 0;
+}
+
+static int
+nsim_get_vf_config(struct net_device *dev, int vf, struct ifla_vf_info *ivi)
+{
+	struct netdevsim *ns = netdev_priv(dev);
+
+	if (vf >= ns->num_vfs)
+		return -EINVAL;
+
+	ivi->vf = vf;
+	ivi->linkstate = ns->vfconfigs[vf].link_state;
+	ivi->min_tx_rate = ns->vfconfigs[vf].min_tx_rate;
+	ivi->max_tx_rate = ns->vfconfigs[vf].max_tx_rate;
+	ivi->vlan = ns->vfconfigs[vf].vlan;
+	ivi->vlan_proto = ns->vfconfigs[vf].vlan_proto;
+	ivi->qos = ns->vfconfigs[vf].qos;
+	memcpy(&ivi->mac, ns->vfconfigs[vf].vf_mac, ETH_ALEN);
+	ivi->spoofchk = ns->vfconfigs[vf].spoofchk_enabled;
+	ivi->trusted = ns->vfconfigs[vf].trusted;
+	ivi->rss_query_en = ns->vfconfigs[vf].rss_query_enabled;
+
+	return 0;
+}
+
+static int nsim_set_vf_link_state(struct net_device *dev, int vf, int state)
+{
+	struct netdevsim *ns = netdev_priv(dev);
+
+	if (vf >= ns->num_vfs)
+		return -EINVAL;
+
+	switch (state) {
+	case IFLA_VF_LINK_STATE_AUTO:
+	case IFLA_VF_LINK_STATE_ENABLE:
+	case IFLA_VF_LINK_STATE_DISABLE:
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	ns->vfconfigs[vf].link_state = state;
+
+	return 0;
+}
+
+static int
+nsim_setup_tc(struct net_device *dev, enum tc_setup_type type, void *type_data)
+{
+	switch (type) {
+	case TC_SETUP_BLOCK:
+		return nsim_setup_tc_block(dev, type_data);
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+static int
+nsim_set_features(struct net_device *dev, netdev_features_t features)
+{
+	struct netdevsim *ns = netdev_priv(dev);
+
+	if ((dev->features & NETIF_F_HW_TC) > (features & NETIF_F_HW_TC))
+		return nsim_bpf_disable_tc(ns);
+
+	return 0;
+}
+
+static const struct net_device_ops nsim_netdev_ops = {
+	.ndo_init		= nsim_init,
+	.ndo_uninit		= nsim_uninit,
+	.ndo_start_xmit		= nsim_start_xmit,
+	.ndo_set_rx_mode	= nsim_set_rx_mode,
+	.ndo_set_mac_address	= eth_mac_addr,
+	.ndo_validate_addr	= eth_validate_addr,
+	.ndo_change_mtu		= nsim_change_mtu,
+	.ndo_get_stats64	= nsim_get_stats64,
+	.ndo_set_vf_mac		= nsim_set_vf_mac,
+	.ndo_set_vf_vlan	= nsim_set_vf_vlan,
+	.ndo_set_vf_rate	= nsim_set_vf_rate,
+	.ndo_set_vf_spoofchk	= nsim_set_vf_spoofchk,
+	.ndo_set_vf_trust	= nsim_set_vf_trust,
+	.ndo_get_vf_config	= nsim_get_vf_config,
+	.ndo_set_vf_link_state	= nsim_set_vf_link_state,
+	.ndo_set_vf_rss_query_en = nsim_set_vf_rss_query_en,
+	.ndo_setup_tc		= nsim_setup_tc,
+	.ndo_set_features	= nsim_set_features,
+	.ndo_bpf		= nsim_bpf,
+};
+
+static void nsim_setup(struct net_device *dev)
+{
+	ether_setup(dev);
+	eth_hw_addr_random(dev);
+
+	dev->netdev_ops = &nsim_netdev_ops;
+	dev->priv_destructor = nsim_free;
+
+	dev->tx_queue_len = 0;
+	dev->flags |= IFF_NOARP;
+	dev->flags &= ~IFF_MULTICAST;
+	dev->priv_flags |= IFF_LIVE_ADDR_CHANGE |
+			   IFF_NO_QUEUE;
+	dev->features |= NETIF_F_HIGHDMA |
+			 NETIF_F_SG |
+			 NETIF_F_FRAGLIST |
+			 NETIF_F_HW_CSUM |
+			 NETIF_F_TSO;
+	dev->hw_features |= NETIF_F_HW_TC;
+	dev->max_mtu = ETH_MAX_MTU;
+}
+
+static int nsim_validate(struct nlattr *tb[], struct nlattr *data[],
+			 struct netlink_ext_ack *extack)
+{
+	if (tb[IFLA_ADDRESS]) {
+		if (nla_len(tb[IFLA_ADDRESS]) != ETH_ALEN)
+			return -EINVAL;
+		if (!is_valid_ether_addr(nla_data(tb[IFLA_ADDRESS])))
+			return -EADDRNOTAVAIL;
+	}
+	return 0;
+}
+
+static struct rtnl_link_ops nsim_link_ops __read_mostly = {
+	.kind		= DRV_NAME,
+	.priv_size	= sizeof(struct netdevsim),
+	.setup		= nsim_setup,
+	.validate	= nsim_validate,
+};
+
+struct dentry *nsim_ddir;
+
+static int __init nsim_module_init(void)
+{
+	int err;
+
+	nsim_ddir = debugfs_create_dir(DRV_NAME, NULL);
+	if (IS_ERR(nsim_ddir))
+		return PTR_ERR(nsim_ddir);
+
+	err = bus_register(&nsim_bus);
+	if (err)
+		goto err_debugfs_destroy;
+
+	err = rtnl_link_register(&nsim_link_ops);
+	if (err)
+		goto err_unreg_bus;
+
+	return 0;
+
+err_unreg_bus:
+	bus_unregister(&nsim_bus);
+err_debugfs_destroy:
+	debugfs_remove_recursive(nsim_ddir);
+	return err;
+}
+
+static void __exit nsim_module_exit(void)
+{
+	rtnl_link_unregister(&nsim_link_ops);
+	bus_unregister(&nsim_bus);
+	debugfs_remove_recursive(nsim_ddir);
+}
+
+module_init(nsim_module_init);
+module_exit(nsim_module_exit);
+MODULE_LICENSE("GPL");
+MODULE_ALIAS_RTNL_LINK(DRV_NAME);
diff --git a/drivers/net/netdevsim/netdevsim.h b/drivers/net/netdevsim/netdevsim.h
new file mode 100644
index 000000000000..32270de9395a
--- /dev/null
+++ b/drivers/net/netdevsim/netdevsim.h
@@ -0,0 +1,78 @@
+/*
+ * Copyright (C) 2017 Netronome Systems, Inc.
+ *
+ * This software is licensed under the GNU General License Version 2,
+ * June 1991 as shown in the file COPYING in the top-level directory of this
+ * source tree.
+ *
+ * THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS"
+ * WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING,
+ * BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE
+ * OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME
+ * THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
+ */
+
+#include <linux/device.h>
+#include <linux/kernel.h>
+#include <linux/list.h>
+#include <linux/netdevice.h>
+#include <linux/u64_stats_sync.h>
+
+#define DRV_NAME	"netdevsim"
+
+#define NSIM_XDP_MAX_MTU	4000
+
+#define NSIM_EA(extack, msg)	NL_SET_ERR_MSG_MOD((extack), msg)
+
+struct bpf_prog;
+struct dentry;
+struct nsim_vf_config;
+
+struct netdevsim {
+	struct net_device *netdev;
+
+	u64 tx_packets;
+	u64 tx_bytes;
+	struct u64_stats_sync syncp;
+
+	struct device dev;
+
+	struct dentry *ddir;
+
+	unsigned int num_vfs;
+	struct nsim_vf_config *vfconfigs;
+
+	struct bpf_prog	*bpf_offloaded;
+	u32 bpf_offloaded_id;
+
+	u32 xdp_flags;
+	int xdp_prog_mode;
+	struct bpf_prog	*xdp_prog;
+
+	u32 prog_id_gen;
+
+	bool bpf_bind_accept;
+	u32 bpf_bind_verifier_delay;
+	struct dentry *ddir_bpf_bound_progs;
+	struct list_head bpf_bound_progs;
+
+	bool bpf_tc_accept;
+	bool bpf_tc_non_bound_accept;
+	bool bpf_xdpdrv_accept;
+	bool bpf_xdpoffload_accept;
+};
+
+extern struct dentry *nsim_ddir;
+
+int nsim_bpf_init(struct netdevsim *ns);
+void nsim_bpf_uninit(struct netdevsim *ns);
+int nsim_bpf(struct net_device *dev, struct netdev_bpf *bpf);
+int nsim_bpf_disable_tc(struct netdevsim *ns);
+int nsim_bpf_setup_tc_block_cb(enum tc_setup_type type,
+			       void *type_data, void *cb_priv);
+
+static inline struct netdevsim *to_nsim(struct device *ptr)
+{
+	return container_of(ptr, struct netdevsim, dev);
+}
diff --git a/drivers/net/phy/amd.c b/drivers/net/phy/amd.c
index 18141c022b13..6fe5dc9201d0 100644
--- a/drivers/net/phy/amd.c
+++ b/drivers/net/phy/amd.c
@@ -68,8 +68,6 @@ static struct phy_driver am79c_driver[] = { {
 	.features	= PHY_BASIC_FEATURES,
 	.flags		= PHY_HAS_INTERRUPT,
 	.config_init	= am79c_config_init,
-	.config_aneg	= genphy_config_aneg,
-	.read_status	= genphy_read_status,
 	.ack_interrupt	= am79c_ack_interrupt,
 	.config_intr	= am79c_config_intr,
 } };
diff --git a/drivers/net/phy/at803x.c b/drivers/net/phy/at803x.c
index 5f93e6add563..29da7a3c7a37 100644
--- a/drivers/net/phy/at803x.c
+++ b/drivers/net/phy/at803x.c
@@ -71,7 +71,6 @@ MODULE_LICENSE("GPL");
 
 struct at803x_priv {
 	bool phy_reset:1;
-	struct gpio_desc *gpiod_reset;
 };
 
 struct at803x_context {
@@ -254,22 +253,11 @@ static int at803x_probe(struct phy_device *phydev)
 {
 	struct device *dev = &phydev->mdio.dev;
 	struct at803x_priv *priv;
-	struct gpio_desc *gpiod_reset;
 
 	priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
 	if (!priv)
 		return -ENOMEM;
 
-	if (phydev->drv->phy_id != ATH8030_PHY_ID)
-		goto does_not_require_reset_workaround;
-
-	gpiod_reset = devm_gpiod_get_optional(dev, "reset", GPIOD_OUT_LOW);
-	if (IS_ERR(gpiod_reset))
-		return PTR_ERR(gpiod_reset);
-
-	priv->gpiod_reset = gpiod_reset;
-
-does_not_require_reset_workaround:
 	phydev->priv = priv;
 
 	return 0;
@@ -343,14 +331,14 @@ static void at803x_link_change_notify(struct phy_device *phydev)
 	 * cannot recover from by software.
 	 */
 	if (phydev->state == PHY_NOLINK) {
-		if (priv->gpiod_reset && !priv->phy_reset) {
+		if (phydev->mdio.reset && !priv->phy_reset) {
 			struct at803x_context context;
 
 			at803x_context_save(phydev, &context);
 
-			gpiod_set_value(priv->gpiod_reset, 1);
+			phy_device_reset(phydev, 1);
 			msleep(1);
-			gpiod_set_value(priv->gpiod_reset, 0);
+			phy_device_reset(phydev, 0);
 			msleep(1);
 
 			at803x_context_restore(phydev, &context);
@@ -408,8 +396,6 @@ static struct phy_driver at803x_driver[] = {
 	.resume			= at803x_resume,
 	.features		= PHY_GBIT_FEATURES,
 	.flags			= PHY_HAS_INTERRUPT,
-	.config_aneg		= genphy_config_aneg,
-	.read_status		= genphy_read_status,
 	.ack_interrupt		= at803x_ack_interrupt,
 	.config_intr		= at803x_config_intr,
 }, {
@@ -426,8 +412,6 @@ static struct phy_driver at803x_driver[] = {
 	.resume			= at803x_resume,
 	.features		= PHY_BASIC_FEATURES,
 	.flags			= PHY_HAS_INTERRUPT,
-	.config_aneg		= genphy_config_aneg,
-	.read_status		= genphy_read_status,
 	.ack_interrupt		= at803x_ack_interrupt,
 	.config_intr		= at803x_config_intr,
 }, {
@@ -443,8 +427,6 @@ static struct phy_driver at803x_driver[] = {
 	.resume			= at803x_resume,
 	.features		= PHY_GBIT_FEATURES,
 	.flags			= PHY_HAS_INTERRUPT,
-	.config_aneg		= genphy_config_aneg,
-	.read_status		= genphy_read_status,
 	.aneg_done		= at803x_aneg_done,
 	.ack_interrupt		= &at803x_ack_interrupt,
 	.config_intr		= &at803x_config_intr,
diff --git a/drivers/net/phy/bcm-cygnus.c b/drivers/net/phy/bcm-cygnus.c
index 3fe8cc5c177e..6838129839ca 100644
--- a/drivers/net/phy/bcm-cygnus.c
+++ b/drivers/net/phy/bcm-cygnus.c
@@ -136,8 +136,6 @@ static struct phy_driver bcm_cygnus_phy_driver[] = {
 	.name          = "Broadcom Cygnus PHY",
 	.features      = PHY_GBIT_FEATURES,
 	.config_init   = bcm_cygnus_config_init,
-	.config_aneg   = genphy_config_aneg,
-	.read_status   = genphy_read_status,
 	.ack_interrupt = bcm_phy_ack_intr,
 	.config_intr   = bcm_phy_config_intr,
 	.suspend       = genphy_suspend,
diff --git a/drivers/net/phy/bcm63xx.c b/drivers/net/phy/bcm63xx.c
index b0492ef2cdaa..cf14613745c9 100644
--- a/drivers/net/phy/bcm63xx.c
+++ b/drivers/net/phy/bcm63xx.c
@@ -69,8 +69,6 @@ static struct phy_driver bcm63xx_driver[] = {
 	.features	= (PHY_BASIC_FEATURES | SUPPORTED_Pause),
 	.flags		= PHY_HAS_INTERRUPT | PHY_IS_INTERNAL,
 	.config_init	= bcm63xx_config_init,
-	.config_aneg	= genphy_config_aneg,
-	.read_status	= genphy_read_status,
 	.ack_interrupt	= bcm_phy_ack_intr,
 	.config_intr	= bcm63xx_config_intr,
 }, {
@@ -81,8 +79,6 @@ static struct phy_driver bcm63xx_driver[] = {
 	.features	= (PHY_BASIC_FEATURES | SUPPORTED_Pause),
 	.flags		= PHY_HAS_INTERRUPT | PHY_IS_INTERNAL,
 	.config_init	= bcm63xx_config_init,
-	.config_aneg	= genphy_config_aneg,
-	.read_status	= genphy_read_status,
 	.ack_interrupt	= bcm_phy_ack_intr,
 	.config_intr	= bcm63xx_config_intr,
 } };
diff --git a/drivers/net/phy/bcm7xxx.c b/drivers/net/phy/bcm7xxx.c
index 8b33f688ac8a..421feb8f92fe 100644
--- a/drivers/net/phy/bcm7xxx.c
+++ b/drivers/net/phy/bcm7xxx.c
@@ -611,8 +611,6 @@ static int bcm7xxx_28nm_probe(struct phy_device *phydev)
 	.features	= PHY_GBIT_FEATURES,				\
 	.flags		= PHY_IS_INTERNAL,				\
 	.config_init	= bcm7xxx_28nm_config_init,			\
-	.config_aneg	= genphy_config_aneg,				\
-	.read_status	= genphy_read_status,				\
 	.resume		= bcm7xxx_28nm_resume,				\
 	.get_tunable	= bcm7xxx_28nm_get_tunable,			\
 	.set_tunable	= bcm7xxx_28nm_set_tunable,			\
@@ -630,8 +628,6 @@ static int bcm7xxx_28nm_probe(struct phy_device *phydev)
 	.features	= PHY_BASIC_FEATURES,				\
 	.flags		= PHY_IS_INTERNAL,				\
 	.config_init	= bcm7xxx_28nm_ephy_config_init,		\
-	.config_aneg	= genphy_config_aneg,				\
-	.read_status	= genphy_read_status,				\
 	.resume		= bcm7xxx_28nm_ephy_resume,			\
 	.get_sset_count	= bcm_phy_get_sset_count,			\
 	.get_strings	= bcm_phy_get_strings,				\
@@ -647,8 +643,6 @@ static int bcm7xxx_28nm_probe(struct phy_device *phydev)
 	.features       = PHY_BASIC_FEATURES,				\
 	.flags          = PHY_IS_INTERNAL,				\
 	.config_init    = bcm7xxx_config_init,				\
-	.config_aneg    = genphy_config_aneg,				\
-	.read_status    = genphy_read_status,				\
 	.suspend        = bcm7xxx_suspend,				\
 	.resume         = bcm7xxx_config_init,				\
 }
diff --git a/drivers/net/phy/broadcom.c b/drivers/net/phy/broadcom.c
index d7ed69deabfb..a8f69c5777bc 100644
--- a/drivers/net/phy/broadcom.c
+++ b/drivers/net/phy/broadcom.c
@@ -548,8 +548,6 @@ static struct phy_driver broadcom_drivers[] = {
 	.features	= PHY_GBIT_FEATURES,
 	.flags		= PHY_HAS_INTERRUPT,
 	.config_init	= bcm54xx_config_init,
-	.config_aneg	= genphy_config_aneg,
-	.read_status	= genphy_read_status,
 	.ack_interrupt	= bcm_phy_ack_intr,
 	.config_intr	= bcm_phy_config_intr,
 }, {
@@ -559,8 +557,6 @@ static struct phy_driver broadcom_drivers[] = {
 	.features	= PHY_GBIT_FEATURES,
 	.flags		= PHY_HAS_INTERRUPT,
 	.config_init	= bcm54xx_config_init,
-	.config_aneg	= genphy_config_aneg,
-	.read_status	= genphy_read_status,
 	.ack_interrupt	= bcm_phy_ack_intr,
 	.config_intr	= bcm_phy_config_intr,
 }, {
@@ -570,8 +566,6 @@ static struct phy_driver broadcom_drivers[] = {
 	.features	= PHY_GBIT_FEATURES,
 	.flags		= PHY_HAS_INTERRUPT,
 	.config_init	= bcm54xx_config_init,
-	.config_aneg	= genphy_config_aneg,
-	.read_status	= genphy_read_status,
 	.ack_interrupt	= bcm_phy_ack_intr,
 	.config_intr	= bcm_phy_config_intr,
 }, {
@@ -581,8 +575,6 @@ static struct phy_driver broadcom_drivers[] = {
 	.features	= PHY_GBIT_FEATURES,
 	.flags		= PHY_HAS_INTERRUPT,
 	.config_init	= bcm54xx_config_init,
-	.config_aneg	= genphy_config_aneg,
-	.read_status	= genphy_read_status,
 	.ack_interrupt	= bcm_phy_ack_intr,
 	.config_intr	= bcm_phy_config_intr,
 }, {
@@ -592,8 +584,6 @@ static struct phy_driver broadcom_drivers[] = {
 	.features	= PHY_GBIT_FEATURES,
 	.flags		= PHY_HAS_INTERRUPT,
 	.config_init	= bcm54xx_config_init,
-	.config_aneg	= genphy_config_aneg,
-	.read_status	= genphy_read_status,
 	.ack_interrupt	= bcm_phy_ack_intr,
 	.config_intr	= bcm_phy_config_intr,
 }, {
@@ -603,8 +593,6 @@ static struct phy_driver broadcom_drivers[] = {
 	.features	= PHY_GBIT_FEATURES,
 	.flags		= PHY_HAS_INTERRUPT,
 	.config_init	= bcm54xx_config_init,
-	.config_aneg	= genphy_config_aneg,
-	.read_status	= genphy_read_status,
 	.ack_interrupt	= bcm_phy_ack_intr,
 	.config_intr	= bcm_phy_config_intr,
 }, {
@@ -614,8 +602,6 @@ static struct phy_driver broadcom_drivers[] = {
 	.features	= PHY_GBIT_FEATURES,
 	.flags		= PHY_HAS_INTERRUPT,
 	.config_init	= bcm54xx_config_init,
-	.config_aneg	= genphy_config_aneg,
-	.read_status	= genphy_read_status,
 	.ack_interrupt	= bcm_phy_ack_intr,
 	.config_intr	= bcm_phy_config_intr,
 }, {
@@ -626,7 +612,6 @@ static struct phy_driver broadcom_drivers[] = {
 	.flags		= PHY_HAS_INTERRUPT,
 	.config_init	= bcm54xx_config_init,
 	.config_aneg	= bcm5481_config_aneg,
-	.read_status	= genphy_read_status,
 	.ack_interrupt	= bcm_phy_ack_intr,
 	.config_intr	= bcm_phy_config_intr,
 }, {
@@ -637,7 +622,6 @@ static struct phy_driver broadcom_drivers[] = {
 	.flags          = PHY_HAS_INTERRUPT,
 	.config_init    = bcm54xx_config_init,
 	.config_aneg    = bcm5481_config_aneg,
-	.read_status    = genphy_read_status,
 	.ack_interrupt  = bcm_phy_ack_intr,
 	.config_intr    = bcm_phy_config_intr,
 }, {
@@ -647,7 +631,6 @@ static struct phy_driver broadcom_drivers[] = {
 	.features	= PHY_GBIT_FEATURES,
 	.flags		= PHY_HAS_INTERRUPT,
 	.config_init	= bcm5482_config_init,
-	.config_aneg	= genphy_config_aneg,
 	.read_status	= bcm5482_read_status,
 	.ack_interrupt	= bcm_phy_ack_intr,
 	.config_intr	= bcm_phy_config_intr,
@@ -658,8 +641,6 @@ static struct phy_driver broadcom_drivers[] = {
 	.features	= PHY_GBIT_FEATURES,
 	.flags		= PHY_HAS_INTERRUPT,
 	.config_init	= bcm54xx_config_init,
-	.config_aneg	= genphy_config_aneg,
-	.read_status	= genphy_read_status,
 	.ack_interrupt	= bcm_phy_ack_intr,
 	.config_intr	= bcm_phy_config_intr,
 }, {
@@ -669,8 +650,6 @@ static struct phy_driver broadcom_drivers[] = {
 	.features	= PHY_GBIT_FEATURES,
 	.flags		= PHY_HAS_INTERRUPT,
 	.config_init	= bcm54xx_config_init,
-	.config_aneg	= genphy_config_aneg,
-	.read_status	= genphy_read_status,
 	.ack_interrupt	= bcm_phy_ack_intr,
 	.config_intr	= bcm_phy_config_intr,
 }, {
@@ -680,8 +659,6 @@ static struct phy_driver broadcom_drivers[] = {
 	.features	= PHY_GBIT_FEATURES,
 	.flags		= PHY_HAS_INTERRUPT,
 	.config_init	= bcm54xx_config_init,
-	.config_aneg	= genphy_config_aneg,
-	.read_status	= genphy_read_status,
 	.ack_interrupt	= bcm_phy_ack_intr,
 	.config_intr	= bcm_phy_config_intr,
 }, {
@@ -691,8 +668,6 @@ static struct phy_driver broadcom_drivers[] = {
 	.features	= PHY_BASIC_FEATURES,
 	.flags		= PHY_HAS_INTERRUPT,
 	.config_init	= brcm_fet_config_init,
-	.config_aneg	= genphy_config_aneg,
-	.read_status	= genphy_read_status,
 	.ack_interrupt	= brcm_fet_ack_interrupt,
 	.config_intr	= brcm_fet_config_intr,
 }, {
@@ -702,8 +677,6 @@ static struct phy_driver broadcom_drivers[] = {
 	.features	= PHY_BASIC_FEATURES,
 	.flags		= PHY_HAS_INTERRUPT,
 	.config_init	= brcm_fet_config_init,
-	.config_aneg	= genphy_config_aneg,
-	.read_status	= genphy_read_status,
 	.ack_interrupt	= brcm_fet_ack_interrupt,
 	.config_intr	= brcm_fet_config_intr,
 } };
diff --git a/drivers/net/phy/cicada.c b/drivers/net/phy/cicada.c
index d339c1afea77..c05af00bf4b6 100644
--- a/drivers/net/phy/cicada.c
+++ b/drivers/net/phy/cicada.c
@@ -110,8 +110,6 @@ static struct phy_driver cis820x_driver[] = {
 	.features	= PHY_GBIT_FEATURES,
 	.flags		= PHY_HAS_INTERRUPT,
 	.config_init	= &cis820x_config_init,
-	.config_aneg	= &genphy_config_aneg,
-	.read_status	= &genphy_read_status,
 	.ack_interrupt	= &cis820x_ack_interrupt,
 	.config_intr	= &cis820x_config_intr,
 }, {
@@ -121,8 +119,6 @@ static struct phy_driver cis820x_driver[] = {
 	.features	= PHY_GBIT_FEATURES,
 	.flags		= PHY_HAS_INTERRUPT,
 	.config_init	= &cis820x_config_init,
-	.config_aneg	= &genphy_config_aneg,
-	.read_status	= &genphy_read_status,
 	.ack_interrupt	= &cis820x_ack_interrupt,
 	.config_intr	= &cis820x_config_intr,
 } };
diff --git a/drivers/net/phy/davicom.c b/drivers/net/phy/davicom.c
index e28913d9ea7e..5ee99b3b428c 100644
--- a/drivers/net/phy/davicom.c
+++ b/drivers/net/phy/davicom.c
@@ -153,7 +153,6 @@ static struct phy_driver dm91xx_driver[] = {
 	.flags		= PHY_HAS_INTERRUPT,
 	.config_init	= dm9161_config_init,
 	.config_aneg	= dm9161_config_aneg,
-	.read_status	= genphy_read_status,
 	.ack_interrupt	= dm9161_ack_interrupt,
 	.config_intr	= dm9161_config_intr,
 }, {
@@ -164,7 +163,6 @@ static struct phy_driver dm91xx_driver[] = {
 	.flags		= PHY_HAS_INTERRUPT,
 	.config_init	= dm9161_config_init,
 	.config_aneg	= dm9161_config_aneg,
-	.read_status	= genphy_read_status,
 	.ack_interrupt	= dm9161_ack_interrupt,
 	.config_intr	= dm9161_config_intr,
 }, {
@@ -175,7 +173,6 @@ static struct phy_driver dm91xx_driver[] = {
 	.flags		= PHY_HAS_INTERRUPT,
 	.config_init	= dm9161_config_init,
 	.config_aneg	= dm9161_config_aneg,
-	.read_status	= genphy_read_status,
 	.ack_interrupt	= dm9161_ack_interrupt,
 	.config_intr	= dm9161_config_intr,
 }, {
@@ -184,8 +181,6 @@ static struct phy_driver dm91xx_driver[] = {
 	.phy_id_mask	= 0x0ffffff0,
 	.features	= PHY_BASIC_FEATURES,
 	.flags		= PHY_HAS_INTERRUPT,
-	.config_aneg	= genphy_config_aneg,
-	.read_status	= genphy_read_status,
 	.ack_interrupt	= dm9161_ack_interrupt,
 	.config_intr	= dm9161_config_intr,
 } };
diff --git a/drivers/net/phy/dp83640.c b/drivers/net/phy/dp83640.c
index cbd629822f04..654f42d00092 100644
--- a/drivers/net/phy/dp83640.c
+++ b/drivers/net/phy/dp83640.c
@@ -1502,8 +1502,6 @@ static struct phy_driver dp83640_driver = {
 	.probe		= dp83640_probe,
 	.remove		= dp83640_remove,
 	.config_init	= dp83640_config_init,
-	.config_aneg	= genphy_config_aneg,
-	.read_status	= genphy_read_status,
 	.ack_interrupt  = dp83640_ack_interrupt,
 	.config_intr    = dp83640_config_intr,
 	.ts_info	= dp83640_ts_info,
diff --git a/drivers/net/phy/dp83822.c b/drivers/net/phy/dp83822.c
index 14335d14e9e4..6e8a2a4f3a6e 100644
--- a/drivers/net/phy/dp83822.c
+++ b/drivers/net/phy/dp83822.c
@@ -325,8 +325,6 @@ static struct phy_driver dp83822_driver[] = {
 		.set_wol = dp83822_set_wol,
 		.ack_interrupt = dp83822_ack_interrupt,
 		.config_intr = dp83822_config_intr,
-		.config_aneg = genphy_config_aneg,
-		.read_status = genphy_read_status,
 		.suspend = dp83822_suspend,
 		.resume = dp83822_resume,
 	 },
diff --git a/drivers/net/phy/dp83848.c b/drivers/net/phy/dp83848.c
index 3966d43c5146..cd09c3af2117 100644
--- a/drivers/net/phy/dp83848.c
+++ b/drivers/net/phy/dp83848.c
@@ -95,8 +95,6 @@ MODULE_DEVICE_TABLE(mdio, dp83848_tbl);
 		.config_init	= genphy_config_init,		\
 		.suspend	= genphy_suspend,		\
 		.resume		= genphy_resume,		\
-		.config_aneg	= genphy_config_aneg,		\
-		.read_status	= genphy_read_status,		\
 								\
 		/* IRQ related */				\
 		.ack_interrupt	= dp83848_ack_interrupt,	\
diff --git a/drivers/net/phy/dp83867.c b/drivers/net/phy/dp83867.c
index c1ab976cc800..ab58224f897f 100644
--- a/drivers/net/phy/dp83867.c
+++ b/drivers/net/phy/dp83867.c
@@ -324,8 +324,6 @@ static struct phy_driver dp83867_driver[] = {
 		.ack_interrupt	= dp83867_ack_interrupt,
 		.config_intr	= dp83867_config_intr,
 
-		.config_aneg	= genphy_config_aneg,
-		.read_status	= genphy_read_status,
 		.suspend	= genphy_suspend,
 		.resume		= genphy_resume,
 	},
diff --git a/drivers/net/phy/icplus.c b/drivers/net/phy/icplus.c
index 567280a72241..791587a49215 100644
--- a/drivers/net/phy/icplus.c
+++ b/drivers/net/phy/icplus.c
@@ -227,8 +227,6 @@ static struct phy_driver icplus_driver[] = {
 	.phy_id_mask	= 0x0ffffff0,
 	.features	= PHY_GBIT_FEATURES,
 	.config_init	= &ip1001_config_init,
-	.config_aneg	= &genphy_config_aneg,
-	.read_status	= &genphy_read_status,
 	.suspend	= genphy_suspend,
 	.resume		= genphy_resume,
 }, {
@@ -239,8 +237,6 @@ static struct phy_driver icplus_driver[] = {
 	.flags		= PHY_HAS_INTERRUPT,
 	.ack_interrupt	= ip101a_g_ack_interrupt,
 	.config_init	= &ip101a_g_config_init,
-	.config_aneg	= &genphy_config_aneg,
-	.read_status	= &genphy_read_status,
 	.suspend	= genphy_suspend,
 	.resume		= genphy_resume,
 } };
diff --git a/drivers/net/phy/intel-xway.c b/drivers/net/phy/intel-xway.c
index 55f8c52dd2f1..a11f80cb5388 100644
--- a/drivers/net/phy/intel-xway.c
+++ b/drivers/net/phy/intel-xway.c
@@ -243,7 +243,6 @@ static struct phy_driver xway_gphy[] = {
 		.flags		= PHY_HAS_INTERRUPT,
 		.config_init	= xway_gphy_config_init,
 		.config_aneg	= xway_gphy14_config_aneg,
-		.read_status	= genphy_read_status,
 		.ack_interrupt	= xway_gphy_ack_interrupt,
 		.did_interrupt	= xway_gphy_did_interrupt,
 		.config_intr	= xway_gphy_config_intr,
@@ -257,7 +256,6 @@ static struct phy_driver xway_gphy[] = {
 		.flags		= PHY_HAS_INTERRUPT,
 		.config_init	= xway_gphy_config_init,
 		.config_aneg	= xway_gphy14_config_aneg,
-		.read_status	= genphy_read_status,
 		.ack_interrupt	= xway_gphy_ack_interrupt,
 		.did_interrupt	= xway_gphy_did_interrupt,
 		.config_intr	= xway_gphy_config_intr,
@@ -271,7 +269,6 @@ static struct phy_driver xway_gphy[] = {
 		.flags		= PHY_HAS_INTERRUPT,
 		.config_init	= xway_gphy_config_init,
 		.config_aneg	= xway_gphy14_config_aneg,
-		.read_status	= genphy_read_status,
 		.ack_interrupt	= xway_gphy_ack_interrupt,
 		.did_interrupt	= xway_gphy_did_interrupt,
 		.config_intr	= xway_gphy_config_intr,
@@ -285,7 +282,6 @@ static struct phy_driver xway_gphy[] = {
 		.flags		= PHY_HAS_INTERRUPT,
 		.config_init	= xway_gphy_config_init,
 		.config_aneg	= xway_gphy14_config_aneg,
-		.read_status	= genphy_read_status,
 		.ack_interrupt	= xway_gphy_ack_interrupt,
 		.did_interrupt	= xway_gphy_did_interrupt,
 		.config_intr	= xway_gphy_config_intr,
@@ -298,8 +294,6 @@ static struct phy_driver xway_gphy[] = {
 		.features	= PHY_GBIT_FEATURES,
 		.flags		= PHY_HAS_INTERRUPT,
 		.config_init	= xway_gphy_config_init,
-		.config_aneg	= genphy_config_aneg,
-		.read_status	= genphy_read_status,
 		.ack_interrupt	= xway_gphy_ack_interrupt,
 		.did_interrupt	= xway_gphy_did_interrupt,
 		.config_intr	= xway_gphy_config_intr,
@@ -312,8 +306,6 @@ static struct phy_driver xway_gphy[] = {
 		.features	= PHY_BASIC_FEATURES,
 		.flags		= PHY_HAS_INTERRUPT,
 		.config_init	= xway_gphy_config_init,
-		.config_aneg	= genphy_config_aneg,
-		.read_status	= genphy_read_status,
 		.ack_interrupt	= xway_gphy_ack_interrupt,
 		.did_interrupt	= xway_gphy_did_interrupt,
 		.config_intr	= xway_gphy_config_intr,
@@ -326,8 +318,6 @@ static struct phy_driver xway_gphy[] = {
 		.features	= PHY_GBIT_FEATURES,
 		.flags		= PHY_HAS_INTERRUPT,
 		.config_init	= xway_gphy_config_init,
-		.config_aneg	= genphy_config_aneg,
-		.read_status	= genphy_read_status,
 		.ack_interrupt	= xway_gphy_ack_interrupt,
 		.did_interrupt	= xway_gphy_did_interrupt,
 		.config_intr	= xway_gphy_config_intr,
@@ -340,8 +330,6 @@ static struct phy_driver xway_gphy[] = {
 		.features	= PHY_BASIC_FEATURES,
 		.flags		= PHY_HAS_INTERRUPT,
 		.config_init	= xway_gphy_config_init,
-		.config_aneg	= genphy_config_aneg,
-		.read_status	= genphy_read_status,
 		.ack_interrupt	= xway_gphy_ack_interrupt,
 		.did_interrupt	= xway_gphy_did_interrupt,
 		.config_intr	= xway_gphy_config_intr,
diff --git a/drivers/net/phy/lxt.c b/drivers/net/phy/lxt.c
index 09d215177fff..c14b254b2879 100644
--- a/drivers/net/phy/lxt.c
+++ b/drivers/net/phy/lxt.c
@@ -259,8 +259,6 @@ static struct phy_driver lxt97x_driver[] = {
 	.features	= PHY_BASIC_FEATURES,
 	.flags		= PHY_HAS_INTERRUPT,
 	.config_init	= lxt970_config_init,
-	.config_aneg	= genphy_config_aneg,
-	.read_status	= genphy_read_status,
 	.ack_interrupt	= lxt970_ack_interrupt,
 	.config_intr	= lxt970_config_intr,
 }, {
@@ -269,8 +267,6 @@ static struct phy_driver lxt97x_driver[] = {
 	.phy_id_mask	= 0xfffffff0,
 	.features	= PHY_BASIC_FEATURES,
 	.flags		= PHY_HAS_INTERRUPT,
-	.config_aneg	= genphy_config_aneg,
-	.read_status	= genphy_read_status,
 	.ack_interrupt	= lxt971_ack_interrupt,
 	.config_intr	= lxt971_config_intr,
 }, {
@@ -290,7 +286,6 @@ static struct phy_driver lxt97x_driver[] = {
 	.flags		= 0,
 	.probe		= lxt973_probe,
 	.config_aneg	= lxt973_config_aneg,
-	.read_status	= genphy_read_status,
 } };
 
 module_phy_driver(lxt97x_driver);
diff --git a/drivers/net/phy/marvell.c b/drivers/net/phy/marvell.c
index 4d02b27df044..6dbb0f4c34eb 100644
--- a/drivers/net/phy/marvell.c
+++ b/drivers/net/phy/marvell.c
@@ -1958,7 +1958,6 @@ static struct phy_driver marvell_drivers[] = {
 		.probe = marvell_probe,
 		.config_init = &marvell_config_init,
 		.config_aneg = &m88e1101_config_aneg,
-		.read_status = &genphy_read_status,
 		.ack_interrupt = &marvell_ack_interrupt,
 		.config_intr = &marvell_config_intr,
 		.resume = &genphy_resume,
@@ -1976,7 +1975,6 @@ static struct phy_driver marvell_drivers[] = {
 		.probe = marvell_probe,
 		.config_init = &m88e1111_config_init,
 		.config_aneg = &marvell_config_aneg,
-		.read_status = &genphy_read_status,
 		.ack_interrupt = &marvell_ack_interrupt,
 		.config_intr = &marvell_config_intr,
 		.resume = &genphy_resume,
@@ -2012,7 +2010,6 @@ static struct phy_driver marvell_drivers[] = {
 		.probe = marvell_probe,
 		.config_init = &m88e1118_config_init,
 		.config_aneg = &m88e1118_config_aneg,
-		.read_status = &genphy_read_status,
 		.ack_interrupt = &marvell_ack_interrupt,
 		.config_intr = &marvell_config_intr,
 		.resume = &genphy_resume,
@@ -2070,7 +2067,6 @@ static struct phy_driver marvell_drivers[] = {
 		.probe = marvell_probe,
 		.config_init = &m88e1145_config_init,
 		.config_aneg = &marvell_config_aneg,
-		.read_status = &genphy_read_status,
 		.ack_interrupt = &marvell_ack_interrupt,
 		.config_intr = &marvell_config_intr,
 		.resume = &genphy_resume,
@@ -2088,7 +2084,6 @@ static struct phy_driver marvell_drivers[] = {
 		.probe = marvell_probe,
 		.config_init = &m88e1149_config_init,
 		.config_aneg = &m88e1118_config_aneg,
-		.read_status = &genphy_read_status,
 		.ack_interrupt = &marvell_ack_interrupt,
 		.config_intr = &marvell_config_intr,
 		.resume = &genphy_resume,
@@ -2106,7 +2101,6 @@ static struct phy_driver marvell_drivers[] = {
 		.probe = marvell_probe,
 		.config_init = &m88e1111_config_init,
 		.config_aneg = &marvell_config_aneg,
-		.read_status = &genphy_read_status,
 		.ack_interrupt = &marvell_ack_interrupt,
 		.config_intr = &marvell_config_intr,
 		.resume = &genphy_resume,
@@ -2123,8 +2117,6 @@ static struct phy_driver marvell_drivers[] = {
 		.flags = PHY_HAS_INTERRUPT,
 		.probe = marvell_probe,
 		.config_init = &m88e1116r_config_init,
-		.config_aneg = &genphy_config_aneg,
-		.read_status = &genphy_read_status,
 		.ack_interrupt = &marvell_ack_interrupt,
 		.config_intr = &marvell_config_intr,
 		.resume = &genphy_resume,
@@ -2200,7 +2192,6 @@ static struct phy_driver marvell_drivers[] = {
 		.features = PHY_BASIC_FEATURES,
 		.flags = PHY_HAS_INTERRUPT,
 		.probe = marvell_probe,
-		.config_aneg = &genphy_config_aneg,
 		.config_init = &m88e3016_config_init,
 		.aneg_done = &marvell_aneg_done,
 		.read_status = &marvell_read_status,
diff --git a/drivers/net/phy/mdio_bus.c b/drivers/net/phy/mdio_bus.c
index 2df7b62c1a36..8f8b7747c54b 100644
--- a/drivers/net/phy/mdio_bus.c
+++ b/drivers/net/phy/mdio_bus.c
@@ -38,6 +38,7 @@
 #include <linux/phy.h>
 #include <linux/io.h>
 #include <linux/uaccess.h>
+#include <linux/gpio/consumer.h>
 
 #include <asm/irq.h>
 
@@ -48,9 +49,26 @@
 
 int mdiobus_register_device(struct mdio_device *mdiodev)
 {
+	struct gpio_desc *gpiod = NULL;
+
 	if (mdiodev->bus->mdio_map[mdiodev->addr])
 		return -EBUSY;
 
+	/* Deassert the optional reset signal */
+	if (mdiodev->dev.of_node)
+		gpiod = fwnode_get_named_gpiod(&mdiodev->dev.of_node->fwnode,
+					       "reset-gpios", 0, GPIOD_OUT_LOW,
+					       "PHY reset");
+	if (PTR_ERR(gpiod) == -ENOENT)
+		gpiod = NULL;
+	else if (IS_ERR(gpiod))
+		return PTR_ERR(gpiod);
+
+	mdiodev->reset = gpiod;
+
+	/* Assert the reset signal again */
+	mdio_device_reset(mdiodev, 1);
+
 	mdiodev->bus->mdio_map[mdiodev->addr] = mdiodev;
 
 	return 0;
@@ -420,6 +438,9 @@ void mdiobus_unregister(struct mii_bus *bus)
 		if (!mdiodev)
 			continue;
 
+		if (mdiodev->reset)
+			gpiod_put(mdiodev->reset);
+
 		mdiodev->device_remove(mdiodev);
 		mdiodev->device_free(mdiodev);
 	}
diff --git a/drivers/net/phy/mdio_device.c b/drivers/net/phy/mdio_device.c
index e24f28924af8..75d97dd9fb28 100644
--- a/drivers/net/phy/mdio_device.c
+++ b/drivers/net/phy/mdio_device.c
@@ -12,6 +12,8 @@
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include <linux/errno.h>
+#include <linux/gpio.h>
+#include <linux/gpio/consumer.h>
 #include <linux/init.h>
 #include <linux/interrupt.h>
 #include <linux/kernel.h>
@@ -114,6 +116,13 @@ void mdio_device_remove(struct mdio_device *mdiodev)
 }
 EXPORT_SYMBOL(mdio_device_remove);
 
+void mdio_device_reset(struct mdio_device *mdiodev, int value)
+{
+	if (mdiodev->reset)
+		gpiod_set_value(mdiodev->reset, value);
+}
+EXPORT_SYMBOL(mdio_device_reset);
+
 /**
  * mdio_probe - probe an MDIO device
  * @dev: device to probe
@@ -128,8 +137,16 @@ static int mdio_probe(struct device *dev)
 	struct mdio_driver *mdiodrv = to_mdio_driver(drv);
 	int err = 0;
 
-	if (mdiodrv->probe)
+	if (mdiodrv->probe) {
+		/* Deassert the reset signal */
+		mdio_device_reset(mdiodev, 0);
+
 		err = mdiodrv->probe(mdiodev);
+		if (err) {
+			/* Assert the reset signal */
+			mdio_device_reset(mdiodev, 1);
+		}
+	}
 
 	return err;
 }
@@ -140,9 +157,13 @@ static int mdio_remove(struct device *dev)
 	struct device_driver *drv = mdiodev->dev.driver;
 	struct mdio_driver *mdiodrv = to_mdio_driver(drv);
 
-	if (mdiodrv->remove)
+	if (mdiodrv->remove) {
 		mdiodrv->remove(mdiodev);
 
+		/* Assert the reset signal */
+		mdio_device_reset(mdiodev, 1);
+	}
+
 	return 0;
 }
 
diff --git a/drivers/net/phy/meson-gxl.c b/drivers/net/phy/meson-gxl.c
index 1ea69b7585d9..401e3234be58 100644
--- a/drivers/net/phy/meson-gxl.c
+++ b/drivers/net/phy/meson-gxl.c
@@ -58,9 +58,7 @@ static struct phy_driver meson_gxl_phy[] = {
 		.features	= PHY_BASIC_FEATURES,
 		.flags		= PHY_IS_INTERNAL,
 		.config_init	= meson_gxl_config_init,
-		.config_aneg	= genphy_config_aneg,
 		.aneg_done      = genphy_aneg_done,
-		.read_status	= genphy_read_status,
 		.suspend        = genphy_suspend,
 		.resume         = genphy_resume,
 	},
diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c
index ab4614113403..fd500b18e77f 100644
--- a/drivers/net/phy/micrel.c
+++ b/drivers/net/phy/micrel.c
@@ -801,8 +801,6 @@ static struct phy_driver ksphy_driver[] = {
 	.flags		= PHY_HAS_INTERRUPT,
 	.driver_data	= &ks8737_type,
 	.config_init	= kszphy_config_init,
-	.config_aneg	= genphy_config_aneg,
-	.read_status	= genphy_read_status,
 	.ack_interrupt	= kszphy_ack_interrupt,
 	.config_intr	= kszphy_config_intr,
 	.suspend	= genphy_suspend,
@@ -816,8 +814,6 @@ static struct phy_driver ksphy_driver[] = {
 	.driver_data	= &ksz8021_type,
 	.probe		= kszphy_probe,
 	.config_init	= kszphy_config_init,
-	.config_aneg	= genphy_config_aneg,
-	.read_status	= genphy_read_status,
 	.ack_interrupt	= kszphy_ack_interrupt,
 	.config_intr	= kszphy_config_intr,
 	.get_sset_count = kszphy_get_sset_count,
@@ -834,8 +830,6 @@ static struct phy_driver ksphy_driver[] = {
 	.driver_data	= &ksz8021_type,
 	.probe		= kszphy_probe,
 	.config_init	= kszphy_config_init,
-	.config_aneg	= genphy_config_aneg,
-	.read_status	= genphy_read_status,
 	.ack_interrupt	= kszphy_ack_interrupt,
 	.config_intr	= kszphy_config_intr,
 	.get_sset_count = kszphy_get_sset_count,
@@ -853,7 +847,6 @@ static struct phy_driver ksphy_driver[] = {
 	.probe		= kszphy_probe,
 	.config_init	= ksz8041_config_init,
 	.config_aneg	= ksz8041_config_aneg,
-	.read_status	= genphy_read_status,
 	.ack_interrupt	= kszphy_ack_interrupt,
 	.config_intr	= kszphy_config_intr,
 	.get_sset_count = kszphy_get_sset_count,
@@ -870,8 +863,6 @@ static struct phy_driver ksphy_driver[] = {
 	.driver_data	= &ksz8041_type,
 	.probe		= kszphy_probe,
 	.config_init	= kszphy_config_init,
-	.config_aneg	= genphy_config_aneg,
-	.read_status	= genphy_read_status,
 	.ack_interrupt	= kszphy_ack_interrupt,
 	.config_intr	= kszphy_config_intr,
 	.get_sset_count = kszphy_get_sset_count,
@@ -888,8 +879,6 @@ static struct phy_driver ksphy_driver[] = {
 	.driver_data	= &ksz8051_type,
 	.probe		= kszphy_probe,
 	.config_init	= kszphy_config_init,
-	.config_aneg	= genphy_config_aneg,
-	.read_status	= genphy_read_status,
 	.ack_interrupt	= kszphy_ack_interrupt,
 	.config_intr	= kszphy_config_intr,
 	.get_sset_count = kszphy_get_sset_count,
@@ -906,8 +895,6 @@ static struct phy_driver ksphy_driver[] = {
 	.driver_data	= &ksz8041_type,
 	.probe		= kszphy_probe,
 	.config_init	= kszphy_config_init,
-	.config_aneg	= genphy_config_aneg,
-	.read_status	= genphy_read_status,
 	.ack_interrupt	= kszphy_ack_interrupt,
 	.config_intr	= kszphy_config_intr,
 	.get_sset_count = kszphy_get_sset_count,
@@ -924,8 +911,6 @@ static struct phy_driver ksphy_driver[] = {
 	.driver_data	= &ksz8081_type,
 	.probe		= kszphy_probe,
 	.config_init	= kszphy_config_init,
-	.config_aneg	= genphy_config_aneg,
-	.read_status	= genphy_read_status,
 	.ack_interrupt	= kszphy_ack_interrupt,
 	.config_intr	= kszphy_config_intr,
 	.get_sset_count = kszphy_get_sset_count,
@@ -940,8 +925,6 @@ static struct phy_driver ksphy_driver[] = {
 	.features	= PHY_BASIC_FEATURES,
 	.flags		= PHY_HAS_INTERRUPT,
 	.config_init	= kszphy_config_init,
-	.config_aneg	= genphy_config_aneg,
-	.read_status	= genphy_read_status,
 	.ack_interrupt	= kszphy_ack_interrupt,
 	.config_intr	= kszphy_config_intr,
 	.suspend	= genphy_suspend,
@@ -955,8 +938,6 @@ static struct phy_driver ksphy_driver[] = {
 	.driver_data	= &ksz9021_type,
 	.probe		= kszphy_probe,
 	.config_init	= ksz9021_config_init,
-	.config_aneg	= genphy_config_aneg,
-	.read_status	= genphy_read_status,
 	.ack_interrupt	= kszphy_ack_interrupt,
 	.config_intr	= kszphy_config_intr,
 	.get_sset_count = kszphy_get_sset_count,
@@ -975,7 +956,6 @@ static struct phy_driver ksphy_driver[] = {
 	.driver_data	= &ksz9021_type,
 	.probe		= kszphy_probe,
 	.config_init	= ksz9031_config_init,
-	.config_aneg	= genphy_config_aneg,
 	.read_status	= ksz9031_read_status,
 	.ack_interrupt	= kszphy_ack_interrupt,
 	.config_intr	= kszphy_config_intr,
@@ -1000,8 +980,6 @@ static struct phy_driver ksphy_driver[] = {
 	.features	= PHY_BASIC_FEATURES,
 	.flags		= PHY_HAS_INTERRUPT,
 	.config_init	= kszphy_config_init,
-	.config_aneg	= genphy_config_aneg,
-	.read_status	= genphy_read_status,
 	.suspend	= genphy_suspend,
 	.resume		= genphy_resume,
 }, {
@@ -1021,8 +999,6 @@ static struct phy_driver ksphy_driver[] = {
 	.name		= "Microchip KSZ9477",
 	.features	= PHY_GBIT_FEATURES,
 	.config_init	= kszphy_config_init,
-	.config_aneg	= genphy_config_aneg,
-	.read_status	= genphy_read_status,
 	.suspend	= genphy_suspend,
 	.resume		= genphy_resume,
 } };
diff --git a/drivers/net/phy/microchip.c b/drivers/net/phy/microchip.c
index 37ee856c7680..0f293ef28935 100644
--- a/drivers/net/phy/microchip.c
+++ b/drivers/net/phy/microchip.c
@@ -153,7 +153,6 @@ static struct phy_driver microchip_phy_driver[] = {
 
 	.config_init	= genphy_config_init,
 	.config_aneg	= lan88xx_config_aneg,
-	.read_status	= genphy_read_status,
 
 	.ack_interrupt	= lan88xx_phy_ack_interrupt,
 	.config_intr	= lan88xx_phy_config_intr,
diff --git a/drivers/net/phy/national.c b/drivers/net/phy/national.c
index 2addf1d3f619..2b1e336961f9 100644
--- a/drivers/net/phy/national.c
+++ b/drivers/net/phy/national.c
@@ -136,8 +136,6 @@ static struct phy_driver dp83865_driver[] = { {
 	.features = PHY_GBIT_FEATURES,
 	.flags = PHY_HAS_INTERRUPT,
 	.config_init = ns_config_init,
-	.config_aneg = genphy_config_aneg,
-	.read_status = genphy_read_status,
 	.ack_interrupt = ns_ack_interrupt,
 	.config_intr = ns_config_intr,
 } };
diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index 2b1e67bc1e73..944143b521d7 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -493,7 +493,10 @@ static int phy_start_aneg_priv(struct phy_device *phydev, bool sync)
 	/* Invalidate LP advertising flags */
 	phydev->lp_advertising = 0;
 
-	err = phydev->drv->config_aneg(phydev);
+	if (phydev->drv->config_aneg)
+		err = phydev->drv->config_aneg(phydev);
+	else
+		err = genphy_config_aneg(phydev);
 	if (err < 0)
 		goto out_unlock;
 
@@ -629,9 +632,6 @@ static irqreturn_t phy_interrupt(int irq, void *phy_dat)
 	if (PHY_HALTED == phydev->state)
 		return IRQ_NONE;		/* It can't be ours.  */
 
-	disable_irq_nosync(irq);
-	atomic_inc(&phydev->irq_disable);
-
 	phy_change(phydev);
 
 	return IRQ_HANDLED;
@@ -689,7 +689,6 @@ phy_err:
  */
 int phy_start_interrupts(struct phy_device *phydev)
 {
-	atomic_set(&phydev->irq_disable, 0);
 	if (request_threaded_irq(phydev->irq, NULL, phy_interrupt,
 				 IRQF_ONESHOT | IRQF_SHARED,
 				 phydev_name(phydev), phydev) < 0) {
@@ -716,13 +715,6 @@ int phy_stop_interrupts(struct phy_device *phydev)
 
 	free_irq(phydev->irq, phydev);
 
-	/* If work indeed has been cancelled, disable_irq() will have
-	 * been left unbalanced from phy_interrupt() and enable_irq()
-	 * has to be called so that other devices on the line work.
-	 */
-	while (atomic_dec_return(&phydev->irq_disable) >= 0)
-		enable_irq(phydev->irq);
-
 	return err;
 }
 EXPORT_SYMBOL(phy_stop_interrupts);
@@ -736,10 +728,11 @@ void phy_change(struct phy_device *phydev)
 	if (phy_interrupt_is_valid(phydev)) {
 		if (phydev->drv->did_interrupt &&
 		    !phydev->drv->did_interrupt(phydev))
-			goto ignore;
+			return;
 
-		if (phy_disable_interrupts(phydev))
-			goto phy_err;
+		if (phydev->state == PHY_HALTED)
+			if (phy_disable_interrupts(phydev))
+				goto phy_err;
 	}
 
 	mutex_lock(&phydev->lock);
@@ -747,28 +740,13 @@ void phy_change(struct phy_device *phydev)
 		phydev->state = PHY_CHANGELINK;
 	mutex_unlock(&phydev->lock);
 
-	if (phy_interrupt_is_valid(phydev)) {
-		atomic_dec(&phydev->irq_disable);
-		enable_irq(phydev->irq);
-
-		/* Reenable interrupts */
-		if (PHY_HALTED != phydev->state &&
-		    phy_config_interrupt(phydev, PHY_INTERRUPT_ENABLED))
-			goto irq_enable_err;
-	}
-
 	/* reschedule state queue work to run as soon as possible */
 	phy_trigger_machine(phydev, true);
-	return;
 
-ignore:
-	atomic_dec(&phydev->irq_disable);
-	enable_irq(phydev->irq);
+	if (phy_interrupt_is_valid(phydev) && phy_clear_interrupt(phydev))
+		goto phy_err;
 	return;
 
-irq_enable_err:
-	disable_irq(phydev->irq);
-	atomic_inc(&phydev->irq_disable);
 phy_err:
 	phy_error(phydev);
 }
@@ -1006,10 +984,6 @@ void phy_state_machine(struct work_struct *work)
 			phydev->state = PHY_NOLINK;
 			phy_link_down(phydev, true);
 		}
-
-		if (phy_interrupt_is_valid(phydev))
-			err = phy_config_interrupt(phydev,
-						   PHY_INTERRUPT_ENABLED);
 		break;
 	case PHY_HALTED:
 		if (phydev->link) {
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 67f25ac29025..1de5e242b8b4 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -632,6 +632,9 @@ int phy_device_register(struct phy_device *phydev)
 	if (err)
 		return err;
 
+	/* Deassert the reset signal */
+	phy_device_reset(phydev, 0);
+
 	/* Run all of the fixups for this PHY */
 	err = phy_scan_fixups(phydev);
 	if (err) {
@@ -650,6 +653,9 @@ int phy_device_register(struct phy_device *phydev)
 	return 0;
 
  out:
+	/* Assert the reset signal */
+	phy_device_reset(phydev, 1);
+
 	mdiobus_unregister_device(&phydev->mdio);
 	return err;
 }
@@ -666,6 +672,10 @@ EXPORT_SYMBOL(phy_device_register);
 void phy_device_remove(struct phy_device *phydev)
 {
 	device_del(&phydev->mdio.dev);
+
+	/* Assert the reset signal */
+	phy_device_reset(phydev, 1);
+
 	mdiobus_unregister_device(&phydev->mdio);
 }
 EXPORT_SYMBOL(phy_device_remove);
@@ -849,6 +859,9 @@ int phy_init_hw(struct phy_device *phydev)
 {
 	int ret = 0;
 
+	/* Deassert the reset signal */
+	phy_device_reset(phydev, 0);
+
 	if (!phydev->drv || !phydev->drv->config_init)
 		return 0;
 
@@ -1126,6 +1139,9 @@ void phy_detach(struct phy_device *phydev)
 	put_device(&phydev->mdio.dev);
 	if (ndev_owner != bus->owner)
 		module_put(bus->owner);
+
+	/* Assert the reset signal */
+	phy_device_reset(phydev, 1);
 }
 EXPORT_SYMBOL(phy_detach);
 
@@ -1811,8 +1827,16 @@ static int phy_probe(struct device *dev)
 	/* Set the state to READY by default */
 	phydev->state = PHY_READY;
 
-	if (phydev->drv->probe)
+	if (phydev->drv->probe) {
+		/* Deassert the reset signal */
+		phy_device_reset(phydev, 0);
+
 		err = phydev->drv->probe(phydev);
+		if (err) {
+			/* Assert the reset signal */
+			phy_device_reset(phydev, 1);
+		}
+	}
 
 	mutex_unlock(&phydev->lock);
 
@@ -1829,8 +1853,12 @@ static int phy_remove(struct device *dev)
 	phydev->state = PHY_DOWN;
 	mutex_unlock(&phydev->lock);
 
-	if (phydev->drv && phydev->drv->remove)
+	if (phydev->drv && phydev->drv->remove) {
 		phydev->drv->remove(phydev);
+
+		/* Assert the reset signal */
+		phy_device_reset(phydev, 1);
+	}
 	phydev->drv = NULL;
 
 	return 0;
@@ -1907,9 +1935,7 @@ static struct phy_driver genphy_driver = {
 	.features	= PHY_GBIT_FEATURES | SUPPORTED_MII |
 			  SUPPORTED_AUI | SUPPORTED_FIBRE |
 			  SUPPORTED_BNC,
-	.config_aneg	= genphy_config_aneg,
 	.aneg_done	= genphy_aneg_done,
-	.read_status	= genphy_read_status,
 	.suspend	= genphy_suspend,
 	.resume		= genphy_resume,
 	.set_loopback   = genphy_loopback,
diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
index 5dc9668dde34..2bfb548d3dff 100644
--- a/drivers/net/phy/phylink.c
+++ b/drivers/net/phy/phylink.c
@@ -36,7 +36,11 @@ enum {
 	PHYLINK_DISABLE_LINK,
 };
 
+/**
+ * struct phylink - internal data type for phylink
+ */
 struct phylink {
+	/* private: */
 	struct net_device *netdev;
 	const struct phylink_mac_ops *ops;
 
@@ -87,6 +91,13 @@ static inline bool linkmode_empty(const unsigned long *src)
 	return bitmap_empty(src, __ETHTOOL_LINK_MODE_MASK_NBITS);
 }
 
+/**
+ * phylink_set_port_modes() - set the port type modes in the ethtool mask
+ * @mask: ethtool link mode mask
+ *
+ * Sets all the port type modes in the ethtool mask.  MAC drivers should
+ * use this in their 'validate' callback.
+ */
 void phylink_set_port_modes(unsigned long *mask)
 {
 	phylink_set(mask, TP);
@@ -117,8 +128,7 @@ static const char *phylink_an_mode_str(unsigned int mode)
 	static const char *modestr[] = {
 		[MLO_AN_PHY] = "phy",
 		[MLO_AN_FIXED] = "fixed",
-		[MLO_AN_SGMII] = "SGMII",
-		[MLO_AN_8023Z] = "802.3z",
+		[MLO_AN_INBAND] = "inband",
 	};
 
 	return mode < ARRAY_SIZE(modestr) ? modestr[mode] : "unknown";
@@ -132,59 +142,64 @@ static int phylink_validate(struct phylink *pl, unsigned long *supported,
 	return phylink_is_empty_linkmode(supported) ? -EINVAL : 0;
 }
 
-static int phylink_parse_fixedlink(struct phylink *pl, struct device_node *np)
+static int phylink_parse_fixedlink(struct phylink *pl,
+				   struct fwnode_handle *fwnode)
 {
-	struct device_node *fixed_node;
+	struct fwnode_handle *fixed_node;
 	const struct phy_setting *s;
 	struct gpio_desc *desc;
-	const __be32 *fixed_prop;
 	u32 speed;
-	int ret, len;
+	int ret;
 
-	fixed_node = of_get_child_by_name(np, "fixed-link");
+	fixed_node = fwnode_get_named_child_node(fwnode, "fixed-link");
 	if (fixed_node) {
-		ret = of_property_read_u32(fixed_node, "speed", &speed);
+		ret = fwnode_property_read_u32(fixed_node, "speed", &speed);
 
 		pl->link_config.speed = speed;
 		pl->link_config.duplex = DUPLEX_HALF;
 
-		if (of_property_read_bool(fixed_node, "full-duplex"))
+		if (fwnode_property_read_bool(fixed_node, "full-duplex"))
 			pl->link_config.duplex = DUPLEX_FULL;
 
 		/* We treat the "pause" and "asym-pause" terminology as
 		 * defining the link partner's ability. */
-		if (of_property_read_bool(fixed_node, "pause"))
+		if (fwnode_property_read_bool(fixed_node, "pause"))
 			pl->link_config.pause |= MLO_PAUSE_SYM;
-		if (of_property_read_bool(fixed_node, "asym-pause"))
+		if (fwnode_property_read_bool(fixed_node, "asym-pause"))
 			pl->link_config.pause |= MLO_PAUSE_ASYM;
 
 		if (ret == 0) {
-			desc = fwnode_get_named_gpiod(&fixed_node->fwnode,
-						      "link-gpios", 0,
-						      GPIOD_IN, "?");
+			desc = fwnode_get_named_gpiod(fixed_node, "link-gpios",
+						      0, GPIOD_IN, "?");
 
 			if (!IS_ERR(desc))
 				pl->link_gpio = desc;
 			else if (desc == ERR_PTR(-EPROBE_DEFER))
 				ret = -EPROBE_DEFER;
 		}
-		of_node_put(fixed_node);
+		fwnode_handle_put(fixed_node);
 
 		if (ret)
 			return ret;
 	} else {
-		fixed_prop = of_get_property(np, "fixed-link", &len);
-		if (!fixed_prop) {
+		u32 prop[5];
+
+		ret = fwnode_property_read_u32_array(fwnode, "fixed-link",
+						     NULL, 0);
+		if (ret != ARRAY_SIZE(prop)) {
 			netdev_err(pl->netdev, "broken fixed-link?\n");
 			return -EINVAL;
 		}
-		if (len == 5 * sizeof(*fixed_prop)) {
-			pl->link_config.duplex = be32_to_cpu(fixed_prop[1]) ?
+
+		ret = fwnode_property_read_u32_array(fwnode, "fixed-link",
+						     prop, ARRAY_SIZE(prop));
+		if (!ret) {
+			pl->link_config.duplex = prop[1] ?
 						DUPLEX_FULL : DUPLEX_HALF;
-			pl->link_config.speed = be32_to_cpu(fixed_prop[2]);
-			if (be32_to_cpu(fixed_prop[3]))
+			pl->link_config.speed = prop[2];
+			if (prop[3])
 				pl->link_config.pause |= MLO_PAUSE_SYM;
-			if (be32_to_cpu(fixed_prop[4]))
+			if (prop[4])
 				pl->link_config.pause |= MLO_PAUSE_ASYM;
 		}
 	}
@@ -220,17 +235,17 @@ static int phylink_parse_fixedlink(struct phylink *pl, struct device_node *np)
 	return 0;
 }
 
-static int phylink_parse_mode(struct phylink *pl, struct device_node *np)
+static int phylink_parse_mode(struct phylink *pl, struct fwnode_handle *fwnode)
 {
-	struct device_node *dn;
+	struct fwnode_handle *dn;
 	const char *managed;
 
-	dn = of_get_child_by_name(np, "fixed-link");
-	if (dn || of_find_property(np, "fixed-link", NULL))
+	dn = fwnode_get_named_child_node(fwnode, "fixed-link");
+	if (dn || fwnode_property_present(fwnode, "fixed-link"))
 		pl->link_an_mode = MLO_AN_FIXED;
-	of_node_put(dn);
+	fwnode_handle_put(dn);
 
-	if (of_property_read_string(np, "managed", &managed) == 0 &&
+	if (fwnode_property_read_string(fwnode, "managed", &managed) == 0 &&
 	    strcmp(managed, "in-band-status") == 0) {
 		if (pl->link_an_mode == MLO_AN_FIXED) {
 			netdev_err(pl->netdev,
@@ -244,6 +259,7 @@ static int phylink_parse_mode(struct phylink *pl, struct device_node *np)
 		phylink_set(pl->supported, Asym_Pause);
 		phylink_set(pl->supported, Pause);
 		pl->link_config.an_enabled = true;
+		pl->link_an_mode = MLO_AN_INBAND;
 
 		switch (pl->link_config.interface) {
 		case PHY_INTERFACE_MODE_SGMII:
@@ -253,17 +269,14 @@ static int phylink_parse_mode(struct phylink *pl, struct device_node *np)
 			phylink_set(pl->supported, 100baseT_Full);
 			phylink_set(pl->supported, 1000baseT_Half);
 			phylink_set(pl->supported, 1000baseT_Full);
-			pl->link_an_mode = MLO_AN_SGMII;
 			break;
 
 		case PHY_INTERFACE_MODE_1000BASEX:
 			phylink_set(pl->supported, 1000baseX_Full);
-			pl->link_an_mode = MLO_AN_8023Z;
 			break;
 
 		case PHY_INTERFACE_MODE_2500BASEX:
 			phylink_set(pl->supported, 2500baseX_Full);
-			pl->link_an_mode = MLO_AN_8023Z;
 			break;
 
 		case PHY_INTERFACE_MODE_10GKR:
@@ -280,7 +293,6 @@ static int phylink_parse_mode(struct phylink *pl, struct device_node *np)
 			phylink_set(pl->supported, 10000baseLR_Full);
 			phylink_set(pl->supported, 10000baseLRM_Full);
 			phylink_set(pl->supported, 10000baseER_Full);
-			pl->link_an_mode = MLO_AN_SGMII;
 			break;
 
 		default:
@@ -320,8 +332,7 @@ static void phylink_mac_config(struct phylink *pl,
 static void phylink_mac_an_restart(struct phylink *pl)
 {
 	if (pl->link_config.an_enabled &&
-	    (pl->link_config.interface == PHY_INTERFACE_MODE_1000BASEX ||
-	     pl->link_config.interface == PHY_INTERFACE_MODE_2500BASEX))
+	    phy_interface_mode_is_8023z(pl->link_config.interface))
 		pl->ops->mac_an_restart(pl->netdev);
 }
 
@@ -423,7 +434,7 @@ static void phylink_resolve(struct work_struct *w)
 			phylink_mac_config(pl, &link_state);
 			break;
 
-		case MLO_AN_SGMII:
+		case MLO_AN_INBAND:
 			phylink_get_mac_state(pl, &link_state);
 			if (pl->phydev) {
 				bool changed = false;
@@ -449,10 +460,6 @@ static void phylink_resolve(struct work_struct *w)
 				}
 			}
 			break;
-
-		case MLO_AN_8023Z:
-			phylink_get_mac_state(pl, &link_state);
-			break;
 		}
 	}
 
@@ -489,15 +496,24 @@ static void phylink_run_resolve(struct phylink *pl)
 
 static const struct sfp_upstream_ops sfp_phylink_ops;
 
-static int phylink_register_sfp(struct phylink *pl, struct device_node *np)
+static int phylink_register_sfp(struct phylink *pl,
+				struct fwnode_handle *fwnode)
 {
-	struct device_node *sfp_np;
+	struct fwnode_reference_args ref;
+	int ret;
 
-	sfp_np = of_parse_phandle(np, "sfp", 0);
-	if (!sfp_np)
-		return 0;
+	ret = fwnode_property_get_reference_args(fwnode, "sfp", NULL,
+						 0, 0, &ref);
+	if (ret < 0) {
+		if (ret == -ENOENT)
+			return 0;
 
-	pl->sfp_bus = sfp_register_upstream(sfp_np, pl->netdev, pl,
+		netdev_err(pl->netdev, "unable to parse \"sfp\" node: %d\n",
+			   ret);
+		return ret;
+	}
+
+	pl->sfp_bus = sfp_register_upstream(ref.fwnode, pl->netdev, pl,
 					    &sfp_phylink_ops);
 	if (!pl->sfp_bus)
 		return -ENOMEM;
@@ -505,7 +521,22 @@ static int phylink_register_sfp(struct phylink *pl, struct device_node *np)
 	return 0;
 }
 
-struct phylink *phylink_create(struct net_device *ndev, struct device_node *np,
+/**
+ * phylink_create() - create a phylink instance
+ * @ndev: a pointer to the &struct net_device
+ * @fwnode: a pointer to a &struct fwnode_handle describing the network
+ *	interface
+ * @iface: the desired link mode defined by &typedef phy_interface_t
+ * @ops: a pointer to a &struct phylink_mac_ops for the MAC.
+ *
+ * Create a new phylink instance, and parse the link parameters found in @np.
+ * This will parse in-band modes, fixed-link or SFP configuration.
+ *
+ * Returns a pointer to a &struct phylink, or an error-pointer value. Users
+ * must use IS_ERR() to check for errors from this function.
+ */
+struct phylink *phylink_create(struct net_device *ndev,
+			       struct fwnode_handle *fwnode,
 			       phy_interface_t iface,
 			       const struct phylink_mac_ops *ops)
 {
@@ -533,21 +564,21 @@ struct phylink *phylink_create(struct net_device *ndev, struct device_node *np,
 	linkmode_copy(pl->link_config.advertising, pl->supported);
 	phylink_validate(pl, pl->supported, &pl->link_config);
 
-	ret = phylink_parse_mode(pl, np);
+	ret = phylink_parse_mode(pl, fwnode);
 	if (ret < 0) {
 		kfree(pl);
 		return ERR_PTR(ret);
 	}
 
 	if (pl->link_an_mode == MLO_AN_FIXED) {
-		ret = phylink_parse_fixedlink(pl, np);
+		ret = phylink_parse_fixedlink(pl, fwnode);
 		if (ret < 0) {
 			kfree(pl);
 			return ERR_PTR(ret);
 		}
 	}
 
-	ret = phylink_register_sfp(pl, np);
+	ret = phylink_register_sfp(pl, fwnode);
 	if (ret < 0) {
 		kfree(pl);
 		return ERR_PTR(ret);
@@ -557,6 +588,13 @@ struct phylink *phylink_create(struct net_device *ndev, struct device_node *np,
 }
 EXPORT_SYMBOL_GPL(phylink_create);
 
+/**
+ * phylink_destroy() - cleanup and destroy the phylink instance
+ * @pl: a pointer to a &struct phylink returned from phylink_create()
+ *
+ * Destroy a phylink instance. Any PHY that has been attached must have been
+ * cleaned up via phylink_disconnect_phy() prior to calling this function.
+ */
 void phylink_destroy(struct phylink *pl)
 {
 	if (pl->sfp_bus)
@@ -653,10 +691,30 @@ static int phylink_bringup_phy(struct phylink *pl, struct phy_device *phy)
 	return 0;
 }
 
+/**
+ * phylink_connect_phy() - connect a PHY to the phylink instance
+ * @pl: a pointer to a &struct phylink returned from phylink_create()
+ * @phy: a pointer to a &struct phy_device.
+ *
+ * Connect @phy to the phylink instance specified by @pl by calling
+ * phy_attach_direct(). Configure the @phy according to the MAC driver's
+ * capabilities, start the PHYLIB state machine and enable any interrupts
+ * that the PHY supports.
+ *
+ * This updates the phylink's ethtool supported and advertising link mode
+ * masks.
+ *
+ * Returns 0 on success or a negative errno.
+ */
 int phylink_connect_phy(struct phylink *pl, struct phy_device *phy)
 {
 	int ret;
 
+	if (WARN_ON(pl->link_an_mode == MLO_AN_FIXED ||
+		    (pl->link_an_mode == MLO_AN_INBAND &&
+		     phy_interface_mode_is_8023z(pl->link_interface))))
+		return -EINVAL;
+
 	ret = phy_attach_direct(pl->netdev, phy, 0, pl->link_interface);
 	if (ret)
 		return ret;
@@ -669,14 +727,27 @@ int phylink_connect_phy(struct phylink *pl, struct phy_device *phy)
 }
 EXPORT_SYMBOL_GPL(phylink_connect_phy);
 
+/**
+ * phylink_of_phy_connect() - connect the PHY specified in the DT mode.
+ * @pl: a pointer to a &struct phylink returned from phylink_create()
+ * @dn: a pointer to a &struct device_node.
+ *
+ * Connect the phy specified in the device node @dn to the phylink instance
+ * specified by @pl. Actions specified in phylink_connect_phy() will be
+ * performed.
+ *
+ * Returns 0 on success or a negative errno.
+ */
 int phylink_of_phy_connect(struct phylink *pl, struct device_node *dn)
 {
 	struct device_node *phy_node;
 	struct phy_device *phy_dev;
 	int ret;
 
-	/* Fixed links are handled without needing a PHY */
-	if (pl->link_an_mode == MLO_AN_FIXED)
+	/* Fixed links and 802.3z are handled without needing a PHY */
+	if (pl->link_an_mode == MLO_AN_FIXED ||
+	    (pl->link_an_mode == MLO_AN_INBAND &&
+	     phy_interface_mode_is_8023z(pl->link_interface)))
 		return 0;
 
 	phy_node = of_parse_phandle(dn, "phy-handle", 0);
@@ -708,6 +779,13 @@ int phylink_of_phy_connect(struct phylink *pl, struct device_node *dn)
 }
 EXPORT_SYMBOL_GPL(phylink_of_phy_connect);
 
+/**
+ * phylink_disconnect_phy() - disconnect any PHY attached to the phylink
+ *   instance.
+ * @pl: a pointer to a &struct phylink returned from phylink_create()
+ *
+ * Disconnect any current PHY from the phylink instance described by @pl.
+ */
 void phylink_disconnect_phy(struct phylink *pl)
 {
 	struct phy_device *phy;
@@ -729,6 +807,14 @@ void phylink_disconnect_phy(struct phylink *pl)
 }
 EXPORT_SYMBOL_GPL(phylink_disconnect_phy);
 
+/**
+ * phylink_mac_change() - notify phylink of a change in MAC state
+ * @pl: a pointer to a &struct phylink returned from phylink_create()
+ * @up: indicates whether the link is currently up.
+ *
+ * The MAC driver should call this driver when the state of its link
+ * changes (eg, link failure, new negotiation results, etc.)
+ */
 void phylink_mac_change(struct phylink *pl, bool up)
 {
 	if (!up)
@@ -738,6 +824,14 @@ void phylink_mac_change(struct phylink *pl, bool up)
 }
 EXPORT_SYMBOL_GPL(phylink_mac_change);
 
+/**
+ * phylink_start() - start a phylink instance
+ * @pl: a pointer to a &struct phylink returned from phylink_create()
+ *
+ * Start the phylink instance specified by @pl, configuring the MAC for the
+ * desired link mode(s) and negotiation style. This should be called from the
+ * network device driver's &struct net_device_ops ndo_open() method.
+ */
 void phylink_start(struct phylink *pl)
 {
 	WARN_ON(!lockdep_rtnl_is_held());
@@ -753,6 +847,12 @@ void phylink_start(struct phylink *pl)
 	phylink_resolve_flow(pl, &pl->link_config);
 	phylink_mac_config(pl, &pl->link_config);
 
+	/* Restart autonegotiation if using 802.3z to ensure that the link
+	 * parameters are properly negotiated.  This is necessary for DSA
+	 * switches using 802.3z negotiation to ensure they see our modes.
+	 */
+	phylink_mac_an_restart(pl);
+
 	clear_bit(PHYLINK_DISABLE_STOPPED, &pl->phylink_disable_state);
 	phylink_run_resolve(pl);
 
@@ -763,6 +863,15 @@ void phylink_start(struct phylink *pl)
 }
 EXPORT_SYMBOL_GPL(phylink_start);
 
+/**
+ * phylink_stop() - stop a phylink instance
+ * @pl: a pointer to a &struct phylink returned from phylink_create()
+ *
+ * Stop the phylink instance specified by @pl. This should be called from the
+ * network device driver's &struct net_device_ops ndo_stop() method.  The
+ * network device's carrier state should not be changed prior to calling this
+ * function.
+ */
 void phylink_stop(struct phylink *pl)
 {
 	WARN_ON(!lockdep_rtnl_is_held());
@@ -778,6 +887,15 @@ void phylink_stop(struct phylink *pl)
 }
 EXPORT_SYMBOL_GPL(phylink_stop);
 
+/**
+ * phylink_ethtool_get_wol() - get the wake on lan parameters for the PHY
+ * @pl: a pointer to a &struct phylink returned from phylink_create()
+ * @wol: a pointer to &struct ethtool_wolinfo to hold the read parameters
+ *
+ * Read the wake on lan parameters from the PHY attached to the phylink
+ * instance specified by @pl. If no PHY is currently attached, report no
+ * support for wake on lan.
+ */
 void phylink_ethtool_get_wol(struct phylink *pl, struct ethtool_wolinfo *wol)
 {
 	WARN_ON(!lockdep_rtnl_is_held());
@@ -790,6 +908,17 @@ void phylink_ethtool_get_wol(struct phylink *pl, struct ethtool_wolinfo *wol)
 }
 EXPORT_SYMBOL_GPL(phylink_ethtool_get_wol);
 
+/**
+ * phylink_ethtool_set_wol() - set wake on lan parameters
+ * @pl: a pointer to a &struct phylink returned from phylink_create()
+ * @wol: a pointer to &struct ethtool_wolinfo for the desired parameters
+ *
+ * Set the wake on lan parameters for the PHY attached to the phylink
+ * instance specified by @pl. If no PHY is attached, returns %EOPNOTSUPP
+ * error.
+ *
+ * Returns zero on success or negative errno code.
+ */
 int phylink_ethtool_set_wol(struct phylink *pl, struct ethtool_wolinfo *wol)
 {
 	int ret = -EOPNOTSUPP;
@@ -825,6 +954,15 @@ static void phylink_get_ksettings(const struct phylink_link_state *state,
 				AUTONEG_DISABLE;
 }
 
+/**
+ * phylink_ethtool_ksettings_get() - get the current link settings
+ * @pl: a pointer to a &struct phylink returned from phylink_create()
+ * @kset: a pointer to a &struct ethtool_link_ksettings to hold link settings
+ *
+ * Read the current link settings for the phylink instance specified by @pl.
+ * This will be the link settings read from the MAC, PHY or fixed link
+ * settings depending on the current negotiation mode.
+ */
 int phylink_ethtool_ksettings_get(struct phylink *pl,
 				  struct ethtool_link_ksettings *kset)
 {
@@ -850,14 +988,13 @@ int phylink_ethtool_ksettings_get(struct phylink *pl,
 		phylink_get_ksettings(&link_state, kset);
 		break;
 
-	case MLO_AN_SGMII:
+	case MLO_AN_INBAND:
 		/* If there is a phy attached, then use the reported
 		 * settings from the phy with no modification.
 		 */
 		if (pl->phydev)
 			break;
 
-	case MLO_AN_8023Z:
 		phylink_get_mac_state(pl, &link_state);
 
 		/* The MAC is reporting the link results from its own PCS
@@ -872,6 +1009,11 @@ int phylink_ethtool_ksettings_get(struct phylink *pl,
 }
 EXPORT_SYMBOL_GPL(phylink_ethtool_ksettings_get);
 
+/**
+ * phylink_ethtool_ksettings_set() - set the link settings
+ * @pl: a pointer to a &struct phylink returned from phylink_create()
+ * @kset: a pointer to a &struct ethtool_link_ksettings for the desired modes
+ */
 int phylink_ethtool_ksettings_set(struct phylink *pl,
 				  const struct ethtool_link_ksettings *kset)
 {
@@ -965,6 +1107,17 @@ int phylink_ethtool_ksettings_set(struct phylink *pl,
 }
 EXPORT_SYMBOL_GPL(phylink_ethtool_ksettings_set);
 
+/**
+ * phylink_ethtool_nway_reset() - restart negotiation
+ * @pl: a pointer to a &struct phylink returned from phylink_create()
+ *
+ * Restart negotiation for the phylink instance specified by @pl. This will
+ * cause any attached phy to restart negotiation with the link partner, and
+ * if the MAC is in a BaseX mode, the MAC will also be requested to restart
+ * negotiation.
+ *
+ * Returns zero on success, or negative error code.
+ */
 int phylink_ethtool_nway_reset(struct phylink *pl)
 {
 	int ret = 0;
@@ -979,6 +1132,11 @@ int phylink_ethtool_nway_reset(struct phylink *pl)
 }
 EXPORT_SYMBOL_GPL(phylink_ethtool_nway_reset);
 
+/**
+ * phylink_ethtool_get_pauseparam() - get the current pause parameters
+ * @pl: a pointer to a &struct phylink returned from phylink_create()
+ * @pause: a pointer to a &struct ethtool_pauseparam
+ */
 void phylink_ethtool_get_pauseparam(struct phylink *pl,
 				    struct ethtool_pauseparam *pause)
 {
@@ -990,6 +1148,11 @@ void phylink_ethtool_get_pauseparam(struct phylink *pl,
 }
 EXPORT_SYMBOL_GPL(phylink_ethtool_get_pauseparam);
 
+/**
+ * phylink_ethtool_set_pauseparam() - set the current pause parameters
+ * @pl: a pointer to a &struct phylink returned from phylink_create()
+ * @pause: a pointer to a &struct ethtool_pauseparam
+ */
 int phylink_ethtool_set_pauseparam(struct phylink *pl,
 				   struct ethtool_pauseparam *pause)
 {
@@ -1028,8 +1191,7 @@ int phylink_ethtool_set_pauseparam(struct phylink *pl,
 			phylink_mac_config(pl, config);
 			break;
 
-		case MLO_AN_SGMII:
-		case MLO_AN_8023Z:
+		case MLO_AN_INBAND:
 			phylink_mac_config(pl, config);
 			phylink_mac_an_restart(pl);
 			break;
@@ -1068,19 +1230,16 @@ int phylink_ethtool_get_module_eeprom(struct phylink *pl,
 }
 EXPORT_SYMBOL_GPL(phylink_ethtool_get_module_eeprom);
 
-int phylink_init_eee(struct phylink *pl, bool clk_stop_enable)
-{
-	int ret = -EPROTONOSUPPORT;
-
-	WARN_ON(!lockdep_rtnl_is_held());
-
-	if (pl->phydev)
-		ret = phy_init_eee(pl->phydev, clk_stop_enable);
-
-	return ret;
-}
-EXPORT_SYMBOL_GPL(phylink_init_eee);
-
+/**
+ * phylink_ethtool_get_eee_err() - read the energy efficient ethernet error
+ *   counter
+ * @pl: a pointer to a &struct phylink returned from phylink_create().
+ *
+ * Read the Energy Efficient Ethernet error counter from the PHY associated
+ * with the phylink instance specified by @pl.
+ *
+ * Returns positive error counter value, or negative error code.
+ */
 int phylink_get_eee_err(struct phylink *pl)
 {
 	int ret = 0;
@@ -1094,6 +1253,11 @@ int phylink_get_eee_err(struct phylink *pl)
 }
 EXPORT_SYMBOL_GPL(phylink_get_eee_err);
 
+/**
+ * phylink_ethtool_get_eee() - read the energy efficient ethernet parameters
+ * @pl: a pointer to a &struct phylink returned from phylink_create()
+ * @eee: a pointer to a &struct ethtool_eee for the read parameters
+ */
 int phylink_ethtool_get_eee(struct phylink *pl, struct ethtool_eee *eee)
 {
 	int ret = -EOPNOTSUPP;
@@ -1107,6 +1271,11 @@ int phylink_ethtool_get_eee(struct phylink *pl, struct ethtool_eee *eee)
 }
 EXPORT_SYMBOL_GPL(phylink_ethtool_get_eee);
 
+/**
+ * phylink_ethtool_set_eee() - set the energy efficient ethernet parameters
+ * @pl: a pointer to a &struct phylink returned from phylink_create()
+ * @eee: a pointer to a &struct ethtool_eee for the desired parameters
+ */
 int phylink_ethtool_set_eee(struct phylink *pl, struct ethtool_eee *eee)
 {
 	int ret = -EOPNOTSUPP;
@@ -1246,9 +1415,7 @@ static int phylink_mii_read(struct phylink *pl, unsigned int phy_id,
 	case MLO_AN_PHY:
 		return -EOPNOTSUPP;
 
-	case MLO_AN_SGMII:
-		/* No phy, fall through to 8023z method */
-	case MLO_AN_8023Z:
+	case MLO_AN_INBAND:
 		if (phy_id == 0) {
 			val = phylink_get_mac_state(pl, &state);
 			if (val < 0)
@@ -1273,15 +1440,31 @@ static int phylink_mii_write(struct phylink *pl, unsigned int phy_id,
 	case MLO_AN_PHY:
 		return -EOPNOTSUPP;
 
-	case MLO_AN_SGMII:
-		/* No phy, fall through to 8023z method */
-	case MLO_AN_8023Z:
+	case MLO_AN_INBAND:
 		break;
 	}
 
 	return 0;
 }
 
+/**
+ * phylink_mii_ioctl() - generic mii ioctl interface
+ * @pl: a pointer to a &struct phylink returned from phylink_create()
+ * @ifr: a pointer to a &struct ifreq for socket ioctls
+ * @cmd: ioctl cmd to execute
+ *
+ * Perform the specified MII ioctl on the PHY attached to the phylink instance
+ * specified by @pl. If no PHY is attached, emulate the presence of the PHY.
+ *
+ * Returns: zero on success or negative error code.
+ *
+ * %SIOCGMIIPHY:
+ *  read register from the current PHY.
+ * %SIOCGMIIREG:
+ *  read register from the specified PHY.
+ * %SIOCSMIIREG:
+ *  set a register on the specified PHY.
+ */
 int phylink_mii_ioctl(struct phylink *pl, struct ifreq *ifr, int cmd)
 {
 	struct mii_ioctl_data *mii = if_mii(ifr);
@@ -1290,7 +1473,7 @@ int phylink_mii_ioctl(struct phylink *pl, struct ifreq *ifr, int cmd)
 	WARN_ON(!lockdep_rtnl_is_held());
 
 	if (pl->phydev) {
-		/* PHYs only exist for MLO_AN_PHY and MLO_AN_SGMII */
+		/* PHYs only exist for MLO_AN_PHY and SGMII */
 		switch (cmd) {
 		case SIOCGMIIPHY:
 			mii->phy_id = pl->phydev->mdio.addr;
@@ -1359,10 +1542,10 @@ static int phylink_sfp_module_insert(void *upstream,
 
 	switch (iface) {
 	case PHY_INTERFACE_MODE_SGMII:
-		mode = MLO_AN_SGMII;
-		break;
 	case PHY_INTERFACE_MODE_1000BASEX:
-		mode = MLO_AN_8023Z;
+	case PHY_INTERFACE_MODE_2500BASEX:
+	case PHY_INTERFACE_MODE_10GKR:
+		mode = MLO_AN_INBAND;
 		break;
 	default:
 		return -EINVAL;
@@ -1389,7 +1572,7 @@ static int phylink_sfp_module_insert(void *upstream,
 		   phylink_an_mode_str(mode), phy_modes(config.interface),
 		   __ETHTOOL_LINK_MODE_MASK_NBITS, support);
 
-	if (mode == MLO_AN_8023Z && pl->phydev)
+	if (phy_interface_mode_is_8023z(iface) && pl->phydev)
 		return -EINVAL;
 
 	changed = !bitmap_equal(pl->supported, support,
diff --git a/drivers/net/phy/qsemi.c b/drivers/net/phy/qsemi.c
index dbef8002bc28..889a4dce1648 100644
--- a/drivers/net/phy/qsemi.c
+++ b/drivers/net/phy/qsemi.c
@@ -118,8 +118,6 @@ static struct phy_driver qs6612_driver[] = { {
 	.features	= PHY_BASIC_FEATURES,
 	.flags		= PHY_HAS_INTERRUPT,
 	.config_init	= qs6612_config_init,
-	.config_aneg	= genphy_config_aneg,
-	.read_status	= genphy_read_status,
 	.ack_interrupt	= qs6612_ack_interrupt,
 	.config_intr	= qs6612_config_intr,
 } };
diff --git a/drivers/net/phy/realtek.c b/drivers/net/phy/realtek.c
index eda0a6e86918..7c1bf688dd48 100644
--- a/drivers/net/phy/realtek.c
+++ b/drivers/net/phy/realtek.c
@@ -13,29 +13,67 @@
  * option) any later version.
  *
  */
+#include <linux/bitops.h>
 #include <linux/phy.h>
 #include <linux/module.h>
 
-#define RTL821x_PHYSR		0x11
-#define RTL821x_PHYSR_DUPLEX	0x2000
-#define RTL821x_PHYSR_SPEED	0xc000
-#define RTL821x_INER		0x12
-#define RTL821x_INER_INIT	0x6400
-#define RTL821x_INSR		0x13
-#define RTL821x_PAGE_SELECT	0x1f
-#define RTL8211E_INER_LINK_STATUS 0x400
+#define RTL821x_PHYSR				0x11
+#define RTL821x_PHYSR_DUPLEX			BIT(13)
+#define RTL821x_PHYSR_SPEED			GENMASK(15, 14)
 
-#define RTL8211F_INER_LINK_STATUS 0x0010
-#define RTL8211F_INSR		0x1d
-#define RTL8211F_TX_DELAY	0x100
+#define RTL821x_INER				0x12
+#define RTL8211B_INER_INIT			0x6400
+#define RTL8211E_INER_LINK_STATUS		BIT(10)
+#define RTL8211F_INER_LINK_STATUS		BIT(4)
 
-#define RTL8201F_ISR		0x1e
-#define RTL8201F_IER		0x13
+#define RTL821x_INSR				0x13
+
+#define RTL821x_PAGE_SELECT			0x1f
+
+#define RTL8211F_INSR				0x1d
+
+#define RTL8211F_TX_DELAY			BIT(8)
+
+#define RTL8201F_ISR				0x1e
+#define RTL8201F_IER				0x13
 
 MODULE_DESCRIPTION("Realtek PHY driver");
 MODULE_AUTHOR("Johnson Leung");
 MODULE_LICENSE("GPL");
 
+static int rtl8211x_page_read(struct phy_device *phydev, u16 page, u16 address)
+{
+	int ret;
+
+	ret = phy_write(phydev, RTL821x_PAGE_SELECT, page);
+	if (ret)
+		return ret;
+
+	ret = phy_read(phydev, address);
+
+	/* restore to default page 0 */
+	phy_write(phydev, RTL821x_PAGE_SELECT, 0x0);
+
+	return ret;
+}
+
+static int rtl8211x_page_write(struct phy_device *phydev, u16 page,
+			       u16 address, u16 val)
+{
+	int ret;
+
+	ret = phy_write(phydev, RTL821x_PAGE_SELECT, page);
+	if (ret)
+		return ret;
+
+	ret = phy_write(phydev, address, val);
+
+	/* restore to default page 0 */
+	phy_write(phydev, RTL821x_PAGE_SELECT, 0x0);
+
+	return ret;
+}
+
 static int rtl8201_ack_interrupt(struct phy_device *phydev)
 {
 	int err;
@@ -58,31 +96,21 @@ static int rtl8211f_ack_interrupt(struct phy_device *phydev)
 {
 	int err;
 
-	phy_write(phydev, RTL821x_PAGE_SELECT, 0xa43);
-	err = phy_read(phydev, RTL8211F_INSR);
-	/* restore to default page 0 */
-	phy_write(phydev, RTL821x_PAGE_SELECT, 0x0);
+	err = rtl8211x_page_read(phydev, 0xa43, RTL8211F_INSR);
 
 	return (err < 0) ? err : 0;
 }
 
 static int rtl8201_config_intr(struct phy_device *phydev)
 {
-	int err;
-
-	/* switch to page 7 */
-	phy_write(phydev, RTL821x_PAGE_SELECT, 0x7);
+	u16 val;
 
 	if (phydev->interrupts == PHY_INTERRUPT_ENABLED)
-		err = phy_write(phydev, RTL8201F_IER,
-				BIT(13) | BIT(12) | BIT(11));
+		val = BIT(13) | BIT(12) | BIT(11);
 	else
-		err = phy_write(phydev, RTL8201F_IER, 0);
-
-	/* restore to default page 0 */
-	phy_write(phydev, RTL821x_PAGE_SELECT, 0x0);
+		val = 0;
 
-	return err;
+	return rtl8211x_page_write(phydev, 0x7, RTL8201F_IER, val);
 }
 
 static int rtl8211b_config_intr(struct phy_device *phydev)
@@ -91,7 +119,7 @@ static int rtl8211b_config_intr(struct phy_device *phydev)
 
 	if (phydev->interrupts == PHY_INTERRUPT_ENABLED)
 		err = phy_write(phydev, RTL821x_INER,
-				RTL821x_INER_INIT);
+				RTL8211B_INER_INIT);
 	else
 		err = phy_write(phydev, RTL821x_INER, 0);
 
@@ -113,41 +141,41 @@ static int rtl8211e_config_intr(struct phy_device *phydev)
 
 static int rtl8211f_config_intr(struct phy_device *phydev)
 {
-	int err;
+	u16 val;
 
-	phy_write(phydev, RTL821x_PAGE_SELECT, 0xa42);
 	if (phydev->interrupts == PHY_INTERRUPT_ENABLED)
-		err = phy_write(phydev, RTL821x_INER,
-				RTL8211F_INER_LINK_STATUS);
+		val = RTL8211F_INER_LINK_STATUS;
 	else
-		err = phy_write(phydev, RTL821x_INER, 0);
-	phy_write(phydev, RTL821x_PAGE_SELECT, 0);
+		val = 0;
 
-	return err;
+	return rtl8211x_page_write(phydev, 0xa42, RTL821x_INER, val);
 }
 
 static int rtl8211f_config_init(struct phy_device *phydev)
 {
 	int ret;
-	u16 reg;
+	u16 val;
 
 	ret = genphy_config_init(phydev);
 	if (ret < 0)
 		return ret;
 
-	phy_write(phydev, RTL821x_PAGE_SELECT, 0xd08);
-	reg = phy_read(phydev, 0x11);
+	ret = rtl8211x_page_read(phydev, 0xd08, 0x11);
+	if (ret < 0)
+		return ret;
+
+	val = ret & 0xffff;
 
 	/* enable TX-delay for rgmii-id and rgmii-txid, otherwise disable it */
 	if (phydev->interface == PHY_INTERFACE_MODE_RGMII_ID ||
 	    phydev->interface == PHY_INTERFACE_MODE_RGMII_TXID)
-		reg |= RTL8211F_TX_DELAY;
+		val |= RTL8211F_TX_DELAY;
 	else
-		reg &= ~RTL8211F_TX_DELAY;
+		val &= ~RTL8211F_TX_DELAY;
 
-	phy_write(phydev, 0x11, reg);
-	/* restore to default page 0 */
-	phy_write(phydev, RTL821x_PAGE_SELECT, 0x0);
+	ret = rtl8211x_page_write(phydev, 0xd08, 0x11, val);
+	if (ret)
+		return ret;
 
 	return 0;
 }
@@ -159,16 +187,12 @@ static struct phy_driver realtek_drvs[] = {
 		.phy_id_mask    = 0x0000ffff,
 		.features       = PHY_BASIC_FEATURES,
 		.flags          = PHY_HAS_INTERRUPT,
-		.config_aneg    = &genphy_config_aneg,
-		.read_status    = &genphy_read_status,
 	}, {
 		.phy_id		= 0x001cc816,
 		.name		= "RTL8201F 10/100Mbps Ethernet",
 		.phy_id_mask	= 0x001fffff,
 		.features	= PHY_BASIC_FEATURES,
 		.flags		= PHY_HAS_INTERRUPT,
-		.config_aneg	= &genphy_config_aneg,
-		.read_status	= &genphy_read_status,
 		.ack_interrupt	= &rtl8201_ack_interrupt,
 		.config_intr	= &rtl8201_config_intr,
 		.suspend	= genphy_suspend,
@@ -179,8 +203,6 @@ static struct phy_driver realtek_drvs[] = {
 		.phy_id_mask	= 0x001fffff,
 		.features	= PHY_GBIT_FEATURES,
 		.flags		= PHY_HAS_INTERRUPT,
-		.config_aneg	= &genphy_config_aneg,
-		.read_status	= &genphy_read_status,
 		.ack_interrupt	= &rtl821x_ack_interrupt,
 		.config_intr	= &rtl8211b_config_intr,
 	}, {
@@ -189,8 +211,6 @@ static struct phy_driver realtek_drvs[] = {
 		.phy_id_mask	= 0x001fffff,
 		.features	= PHY_GBIT_FEATURES,
 		.flags		= PHY_HAS_INTERRUPT,
-		.config_aneg	= genphy_config_aneg,
-		.read_status	= genphy_read_status,
 		.ack_interrupt	= rtl821x_ack_interrupt,
 		.config_intr	= rtl8211e_config_intr,
 		.suspend	= genphy_suspend,
@@ -201,8 +221,6 @@ static struct phy_driver realtek_drvs[] = {
 		.phy_id_mask	= 0x001fffff,
 		.features	= PHY_GBIT_FEATURES,
 		.flags		= PHY_HAS_INTERRUPT,
-		.config_aneg	= &genphy_config_aneg,
-		.read_status	= &genphy_read_status,
 		.ack_interrupt	= &rtl821x_ack_interrupt,
 		.config_intr	= &rtl8211e_config_intr,
 		.suspend	= genphy_suspend,
@@ -213,9 +231,7 @@ static struct phy_driver realtek_drvs[] = {
 		.phy_id_mask	= 0x001fffff,
 		.features	= PHY_GBIT_FEATURES,
 		.flags		= PHY_HAS_INTERRUPT,
-		.config_aneg	= &genphy_config_aneg,
 		.config_init	= &rtl8211f_config_init,
-		.read_status	= &genphy_read_status,
 		.ack_interrupt	= &rtl8211f_ack_interrupt,
 		.config_intr	= &rtl8211f_config_intr,
 		.suspend	= genphy_suspend,
diff --git a/drivers/net/phy/rockchip.c b/drivers/net/phy/rockchip.c
index c092af137056..f1da70b9b55f 100644
--- a/drivers/net/phy/rockchip.c
+++ b/drivers/net/phy/rockchip.c
@@ -213,7 +213,6 @@ static struct phy_driver rockchip_phy_driver[] = {
 	.soft_reset		= genphy_soft_reset,
 	.config_init		= rockchip_integrated_phy_config_init,
 	.config_aneg		= rockchip_config_aneg,
-	.read_status		= genphy_read_status,
 	.suspend		= genphy_suspend,
 	.resume			= rockchip_phy_resume,
 },
diff --git a/drivers/net/phy/sfp-bus.c b/drivers/net/phy/sfp-bus.c
index 8a1b1f4c1b7c..1356dba0d9d3 100644
--- a/drivers/net/phy/sfp-bus.c
+++ b/drivers/net/phy/sfp-bus.c
@@ -8,10 +8,14 @@
 
 #include "sfp.h"
 
+/**
+ * struct sfp_bus - internal representation of a sfp bus
+ */
 struct sfp_bus {
+	/* private: */
 	struct kref kref;
 	struct list_head node;
-	struct device_node *device_node;
+	struct fwnode_handle *fwnode;
 
 	const struct sfp_socket_ops *socket_ops;
 	struct device *sfp_dev;
@@ -26,6 +30,20 @@ struct sfp_bus {
 	bool started;
 };
 
+/**
+ * sfp_parse_port() - Parse the EEPROM base ID, setting the port type
+ * @bus: a pointer to the &struct sfp_bus structure for the sfp module
+ * @id: a pointer to the module's &struct sfp_eeprom_id
+ * @support: optional pointer to an array of unsigned long for the
+ *   ethtool support mask
+ *
+ * Parse the EEPROM identification given in @id, and return one of
+ * %PORT_TP, %PORT_FIBRE or %PORT_OTHER. If @support is non-%NULL,
+ * also set the ethtool %ETHTOOL_LINK_MODE_xxx_BIT corresponding with
+ * the connector type.
+ *
+ * If the port type is not known, returns %PORT_OTHER.
+ */
 int sfp_parse_port(struct sfp_bus *bus, const struct sfp_eeprom_id *id,
 		   unsigned long *support)
 {
@@ -78,6 +96,24 @@ int sfp_parse_port(struct sfp_bus *bus, const struct sfp_eeprom_id *id,
 }
 EXPORT_SYMBOL_GPL(sfp_parse_port);
 
+/**
+ * sfp_parse_interface() - Parse the phy_interface_t
+ * @bus: a pointer to the &struct sfp_bus structure for the sfp module
+ * @id: a pointer to the module's &struct sfp_eeprom_id
+ *
+ * Derive the phy_interface_t mode for the information found in the
+ * module's identifying EEPROM. There is no standard or defined way
+ * to derive this information, so we use some heuristics.
+ *
+ * If the encoding is 64b66b, then the module must be >= 10G, so
+ * return %PHY_INTERFACE_MODE_10GKR.
+ *
+ * If it's 8b10b, then it's 1G or slower. If it's definitely a fibre
+ * module, return %PHY_INTERFACE_MODE_1000BASEX mode, otherwise return
+ * %PHY_INTERFACE_MODE_SGMII mode.
+ *
+ * If the encoding is not known, return %PHY_INTERFACE_MODE_NA.
+ */
 phy_interface_t sfp_parse_interface(struct sfp_bus *bus,
 				    const struct sfp_eeprom_id *id)
 {
@@ -117,6 +153,15 @@ phy_interface_t sfp_parse_interface(struct sfp_bus *bus,
 }
 EXPORT_SYMBOL_GPL(sfp_parse_interface);
 
+/**
+ * sfp_parse_support() - Parse the eeprom id for supported link modes
+ * @bus: a pointer to the &struct sfp_bus structure for the sfp module
+ * @id: a pointer to the module's &struct sfp_eeprom_id
+ * @support: pointer to an array of unsigned long for the ethtool support mask
+ *
+ * Parse the EEPROM identification information and derive the supported
+ * ethtool link modes for the module.
+ */
 void sfp_parse_support(struct sfp_bus *bus, const struct sfp_eeprom_id *id,
 		       unsigned long *support)
 {
@@ -215,7 +260,7 @@ static const struct sfp_upstream_ops *sfp_get_upstream_ops(struct sfp_bus *bus)
 	return bus->registered ? bus->upstream_ops : NULL;
 }
 
-static struct sfp_bus *sfp_bus_get(struct device_node *np)
+static struct sfp_bus *sfp_bus_get(struct fwnode_handle *fwnode)
 {
 	struct sfp_bus *sfp, *new, *found = NULL;
 
@@ -224,7 +269,7 @@ static struct sfp_bus *sfp_bus_get(struct device_node *np)
 	mutex_lock(&sfp_mutex);
 
 	list_for_each_entry(sfp, &sfp_buses, node) {
-		if (sfp->device_node == np) {
+		if (sfp->fwnode == fwnode) {
 			kref_get(&sfp->kref);
 			found = sfp;
 			break;
@@ -233,7 +278,7 @@ static struct sfp_bus *sfp_bus_get(struct device_node *np)
 
 	if (!found && new) {
 		kref_init(&new->kref);
-		new->device_node = np;
+		new->fwnode = fwnode;
 		list_add(&new->node, &sfp_buses);
 		found = new;
 		new = NULL;
@@ -246,7 +291,7 @@ static struct sfp_bus *sfp_bus_get(struct device_node *np)
 	return found;
 }
 
-static void sfp_bus_release(struct kref *kref) __releases(sfp_mutex)
+static void sfp_bus_release(struct kref *kref)
 {
 	struct sfp_bus *bus = container_of(kref, struct sfp_bus, kref);
 
@@ -293,6 +338,16 @@ static void sfp_unregister_bus(struct sfp_bus *bus)
 	bus->registered = false;
 }
 
+/**
+ * sfp_get_module_info() - Get the ethtool_modinfo for a SFP module
+ * @bus: a pointer to the &struct sfp_bus structure for the sfp module
+ * @modinfo: a &struct ethtool_modinfo
+ *
+ * Fill in the type and eeprom_len parameters in @modinfo for a module on
+ * the sfp bus specified by @bus.
+ *
+ * Returns 0 on success or a negative errno number.
+ */
 int sfp_get_module_info(struct sfp_bus *bus, struct ethtool_modinfo *modinfo)
 {
 	if (!bus->registered)
@@ -301,6 +356,17 @@ int sfp_get_module_info(struct sfp_bus *bus, struct ethtool_modinfo *modinfo)
 }
 EXPORT_SYMBOL_GPL(sfp_get_module_info);
 
+/**
+ * sfp_get_module_eeprom() - Read the SFP module EEPROM
+ * @bus: a pointer to the &struct sfp_bus structure for the sfp module
+ * @ee: a &struct ethtool_eeprom
+ * @data: buffer to contain the EEPROM data (must be at least @ee->len bytes)
+ *
+ * Read the EEPROM as specified by the supplied @ee. See the documentation
+ * for &struct ethtool_eeprom for the region to be read.
+ *
+ * Returns 0 on success or a negative errno number.
+ */
 int sfp_get_module_eeprom(struct sfp_bus *bus, struct ethtool_eeprom *ee,
 			  u8 *data)
 {
@@ -310,6 +376,15 @@ int sfp_get_module_eeprom(struct sfp_bus *bus, struct ethtool_eeprom *ee,
 }
 EXPORT_SYMBOL_GPL(sfp_get_module_eeprom);
 
+/**
+ * sfp_upstream_start() - Inform the SFP that the network device is up
+ * @bus: a pointer to the &struct sfp_bus structure for the sfp module
+ *
+ * Inform the SFP socket that the network device is now up, so that the
+ * module can be enabled by allowing TX_DISABLE to be deasserted. This
+ * should be called from the network device driver's &struct net_device_ops
+ * ndo_open() method.
+ */
 void sfp_upstream_start(struct sfp_bus *bus)
 {
 	if (bus->registered)
@@ -318,6 +393,15 @@ void sfp_upstream_start(struct sfp_bus *bus)
 }
 EXPORT_SYMBOL_GPL(sfp_upstream_start);
 
+/**
+ * sfp_upstream_stop() - Inform the SFP that the network device is down
+ * @bus: a pointer to the &struct sfp_bus structure for the sfp module
+ *
+ * Inform the SFP socket that the network device is now up, so that the
+ * module can be disabled by asserting TX_DISABLE, disabling the laser
+ * in optical modules. This should be called from the network device
+ * driver's &struct net_device_ops ndo_stop() method.
+ */
 void sfp_upstream_stop(struct sfp_bus *bus)
 {
 	if (bus->registered)
@@ -326,11 +410,24 @@ void sfp_upstream_stop(struct sfp_bus *bus)
 }
 EXPORT_SYMBOL_GPL(sfp_upstream_stop);
 
-struct sfp_bus *sfp_register_upstream(struct device_node *np,
+/**
+ * sfp_register_upstream() - Register the neighbouring device
+ * @np: device node for the SFP bus
+ * @ndev: network device associated with the interface
+ * @upstream: the upstream private data
+ * @ops: the upstream's &struct sfp_upstream_ops
+ *
+ * Register the upstream device (eg, PHY) with the SFP bus. MAC drivers
+ * should use phylink, which will call this function for them. Returns
+ * a pointer to the allocated &struct sfp_bus.
+ *
+ * On error, returns %NULL.
+ */
+struct sfp_bus *sfp_register_upstream(struct fwnode_handle *fwnode,
 				      struct net_device *ndev, void *upstream,
 				      const struct sfp_upstream_ops *ops)
 {
-	struct sfp_bus *bus = sfp_bus_get(np);
+	struct sfp_bus *bus = sfp_bus_get(fwnode);
 	int ret = 0;
 
 	if (bus) {
@@ -353,6 +450,13 @@ struct sfp_bus *sfp_register_upstream(struct device_node *np,
 }
 EXPORT_SYMBOL_GPL(sfp_register_upstream);
 
+/**
+ * sfp_unregister_upstream() - Unregister sfp bus
+ * @bus: a pointer to the &struct sfp_bus structure for the sfp module
+ *
+ * Unregister a previously registered upstream connection for the SFP
+ * module. @bus is returned from sfp_register_upstream().
+ */
 void sfp_unregister_upstream(struct sfp_bus *bus)
 {
 	rtnl_lock();
@@ -433,7 +537,7 @@ EXPORT_SYMBOL_GPL(sfp_module_remove);
 struct sfp_bus *sfp_register_socket(struct device *dev, struct sfp *sfp,
 				    const struct sfp_socket_ops *ops)
 {
-	struct sfp_bus *bus = sfp_bus_get(dev->of_node);
+	struct sfp_bus *bus = sfp_bus_get(dev->fwnode);
 	int ret = 0;
 
 	if (bus) {
diff --git a/drivers/net/phy/smsc.c b/drivers/net/phy/smsc.c
index 2306bfae057f..a1961ba87e2b 100644
--- a/drivers/net/phy/smsc.c
+++ b/drivers/net/phy/smsc.c
@@ -227,8 +227,6 @@ static struct phy_driver smsc_phy_driver[] = {
 	.probe		= smsc_phy_probe,
 
 	/* basic functions */
-	.config_aneg	= genphy_config_aneg,
-	.read_status	= genphy_read_status,
 	.config_init	= smsc_phy_config_init,
 	.soft_reset	= smsc_phy_reset,
 
@@ -249,8 +247,6 @@ static struct phy_driver smsc_phy_driver[] = {
 	.probe		= smsc_phy_probe,
 
 	/* basic functions */
-	.config_aneg	= genphy_config_aneg,
-	.read_status	= genphy_read_status,
 	.config_init	= smsc_phy_config_init,
 	.soft_reset	= smsc_phy_reset,
 
@@ -276,7 +272,6 @@ static struct phy_driver smsc_phy_driver[] = {
 	.probe		= smsc_phy_probe,
 
 	/* basic functions */
-	.config_aneg	= genphy_config_aneg,
 	.read_status	= lan87xx_read_status,
 	.config_init	= smsc_phy_config_init,
 	.soft_reset	= smsc_phy_reset,
@@ -303,8 +298,6 @@ static struct phy_driver smsc_phy_driver[] = {
 	.probe		= smsc_phy_probe,
 
 	/* basic functions */
-	.config_aneg	= genphy_config_aneg,
-	.read_status	= genphy_read_status,
 	.config_init	= lan911x_config_init,
 
 	/* IRQ related */
@@ -324,7 +317,6 @@ static struct phy_driver smsc_phy_driver[] = {
 	.probe		= smsc_phy_probe,
 
 	/* basic functions */
-	.config_aneg	= genphy_config_aneg,
 	.read_status	= lan87xx_read_status,
 	.config_init	= smsc_phy_config_init,
 	.soft_reset	= smsc_phy_reset,
@@ -351,7 +343,6 @@ static struct phy_driver smsc_phy_driver[] = {
 	.probe		= smsc_phy_probe,
 
 	/* basic functions */
-	.config_aneg	= genphy_config_aneg,
 	.read_status	= lan87xx_read_status,
 	.config_init	= smsc_phy_config_init,
 	.soft_reset	= smsc_phy_reset,
diff --git a/drivers/net/phy/ste10Xp.c b/drivers/net/phy/ste10Xp.c
index d00cfb64529e..fbd548a1ad84 100644
--- a/drivers/net/phy/ste10Xp.c
+++ b/drivers/net/phy/ste10Xp.c
@@ -89,8 +89,6 @@ static struct phy_driver ste10xp_pdriver[] = {
 	.features = PHY_BASIC_FEATURES | SUPPORTED_Pause,
 	.flags = PHY_HAS_INTERRUPT,
 	.config_init = ste10Xp_config_init,
-	.config_aneg = genphy_config_aneg,
-	.read_status = genphy_read_status,
 	.ack_interrupt = ste10Xp_ack_interrupt,
 	.config_intr = ste10Xp_config_intr,
 	.suspend = genphy_suspend,
@@ -102,8 +100,6 @@ static struct phy_driver ste10xp_pdriver[] = {
 	.features = PHY_BASIC_FEATURES | SUPPORTED_Pause,
 	.flags = PHY_HAS_INTERRUPT,
 	.config_init = ste10Xp_config_init,
-	.config_aneg = genphy_config_aneg,
-	.read_status = genphy_read_status,
 	.ack_interrupt = ste10Xp_ack_interrupt,
 	.config_intr = ste10Xp_config_intr,
 	.suspend = genphy_suspend,
diff --git a/drivers/net/phy/uPD60620.c b/drivers/net/phy/uPD60620.c
index 96b33475ea5e..55f48ee3595a 100644
--- a/drivers/net/phy/uPD60620.c
+++ b/drivers/net/phy/uPD60620.c
@@ -95,7 +95,6 @@ static struct phy_driver upd60620_driver[1] = { {
 	.features       = PHY_BASIC_FEATURES,
 	.flags          = 0,
 	.config_init    = upd60620_config_init,
-	.config_aneg    = genphy_config_aneg,
 	.read_status    = upd60620_read_status,
 } };
 
diff --git a/drivers/net/phy/vitesse.c b/drivers/net/phy/vitesse.c
index f78ff0279648..d9dd8fbfffc7 100644
--- a/drivers/net/phy/vitesse.c
+++ b/drivers/net/phy/vitesse.c
@@ -267,7 +267,6 @@ static struct phy_driver vsc82xx_driver[] = {
 	.flags          = PHY_HAS_INTERRUPT,
 	.config_init    = &vsc824x_config_init,
 	.config_aneg    = &vsc82x4_config_aneg,
-	.read_status    = &genphy_read_status,
 	.ack_interrupt  = &vsc824x_ack_interrupt,
 	.config_intr    = &vsc82xx_config_intr,
 }, {
@@ -278,7 +277,6 @@ static struct phy_driver vsc82xx_driver[] = {
 	.flags		= PHY_HAS_INTERRUPT,
 	.config_init	= &vsc824x_config_init,
 	.config_aneg	= &vsc82x4_config_aneg,
-	.read_status	= &genphy_read_status,
 	.ack_interrupt	= &vsc824x_ack_interrupt,
 	.config_intr	= &vsc82xx_config_intr,
 }, {
@@ -289,7 +287,6 @@ static struct phy_driver vsc82xx_driver[] = {
 	.flags		= PHY_HAS_INTERRUPT,
 	.config_init	= &vsc824x_config_init,
 	.config_aneg	= &vsc82x4_config_aneg,
-	.read_status	= &genphy_read_status,
 	.ack_interrupt	= &vsc824x_ack_interrupt,
 	.config_intr	= &vsc82xx_config_intr,
 }, {
@@ -300,7 +297,6 @@ static struct phy_driver vsc82xx_driver[] = {
 	.flags          = PHY_HAS_INTERRUPT,
 	.config_init    = &vsc824x_config_init,
 	.config_aneg    = &vsc82x4_config_aneg,
-	.read_status    = &genphy_read_status,
 	.ack_interrupt  = &vsc824x_ack_interrupt,
 	.config_intr    = &vsc82xx_config_intr,
 }, {
@@ -311,7 +307,6 @@ static struct phy_driver vsc82xx_driver[] = {
 	.flags          = PHY_HAS_INTERRUPT,
 	.config_init    = &vsc824x_config_init,
 	.config_aneg    = &vsc82x4_config_aneg,
-	.read_status    = &genphy_read_status,
 	.ack_interrupt  = &vsc824x_ack_interrupt,
 	.config_intr    = &vsc82xx_config_intr,
 }, {
@@ -321,8 +316,6 @@ static struct phy_driver vsc82xx_driver[] = {
 	.features       = PHY_GBIT_FEATURES,
 	.flags          = PHY_HAS_INTERRUPT,
 	.config_init    = &vsc8601_config_init,
-	.config_aneg    = &genphy_config_aneg,
-	.read_status    = &genphy_read_status,
 	.ack_interrupt  = &vsc824x_ack_interrupt,
 	.config_intr    = &vsc82xx_config_intr,
 }, {
@@ -333,7 +326,6 @@ static struct phy_driver vsc82xx_driver[] = {
 	.flags          = PHY_HAS_INTERRUPT,
 	.config_init    = &vsc824x_config_init,
 	.config_aneg    = &vsc82x4_config_aneg,
-	.read_status    = &genphy_read_status,
 	.ack_interrupt  = &vsc824x_ack_interrupt,
 	.config_intr    = &vsc82xx_config_intr,
 }, {
@@ -344,8 +336,6 @@ static struct phy_driver vsc82xx_driver[] = {
 	.features	= PHY_GBIT_FEATURES,
 	.flags		= PHY_HAS_INTERRUPT,
 	.config_init	= &vsc8221_config_init,
-	.config_aneg	= &genphy_config_aneg,
-	.read_status	= &genphy_read_status,
 	.ack_interrupt	= &vsc824x_ack_interrupt,
 	.config_intr	= &vsc82xx_config_intr,
 }, {
@@ -356,8 +346,6 @@ static struct phy_driver vsc82xx_driver[] = {
 	.features	= PHY_GBIT_FEATURES,
 	.flags		= PHY_HAS_INTERRUPT,
 	.config_init	= &vsc8221_config_init,
-	.config_aneg	= &genphy_config_aneg,
-	.read_status	= &genphy_read_status,
 	.ack_interrupt	= &vsc824x_ack_interrupt,
 	.config_intr	= &vsc82xx_config_intr,
 } };
diff --git a/drivers/net/slip/slip.c b/drivers/net/slip/slip.c
index cc63102ca96e..8940417c30e5 100644
--- a/drivers/net/slip/slip.c
+++ b/drivers/net/slip/slip.c
@@ -731,7 +731,7 @@ static void sl_sync(void)
 
 
 /* Find a free SLIP channel, and link in this `tty' line. */
-static struct slip *sl_alloc(dev_t line)
+static struct slip *sl_alloc(void)
 {
 	int i;
 	char name[IFNAMSIZ];
@@ -809,7 +809,7 @@ static int slip_open(struct tty_struct *tty)
 
 	/* OK.  Find a free SLIP channel to use. */
 	err = -ENFILE;
-	sl = sl_alloc(tty_devnum(tty));
+	sl = sl_alloc();
 	if (sl == NULL)
 		goto err_exit;
 
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 4f4a842a1c9c..e367d6310353 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -195,6 +195,11 @@ struct tun_flow_entry {
 
 #define TUN_NUM_FLOW_ENTRIES 1024
 
+struct tun_steering_prog {
+	struct rcu_head rcu;
+	struct bpf_prog *prog;
+};
+
 /* Since the socket were moved to tun_file, to preserve the behavior of persist
  * device, socket filter, sndbuf and vnet header size were restore when the
  * file were attached to a persist device.
@@ -232,6 +237,7 @@ struct tun_struct {
 	u32 rx_batched;
 	struct tun_pcpu_stats __percpu *pcpu_stats;
 	struct bpf_prog __rcu *xdp_prog;
+	struct tun_steering_prog __rcu *steering_prog;
 };
 
 static int tun_napi_receive(struct napi_struct *napi, int budget)
@@ -537,15 +543,12 @@ static inline void tun_flow_save_rps_rxhash(struct tun_flow_entry *e, u32 hash)
  * different rxq no. here. If we could not get rxhash, then we would
  * hope the rxq no. may help here.
  */
-static u16 tun_select_queue(struct net_device *dev, struct sk_buff *skb,
-			    void *accel_priv, select_queue_fallback_t fallback)
+static u16 tun_automq_select_queue(struct tun_struct *tun, struct sk_buff *skb)
 {
-	struct tun_struct *tun = netdev_priv(dev);
 	struct tun_flow_entry *e;
 	u32 txq = 0;
 	u32 numqueues = 0;
 
-	rcu_read_lock();
 	numqueues = READ_ONCE(tun->numqueues);
 
 	txq = __skb_get_hash_symmetric(skb);
@@ -563,10 +566,37 @@ static u16 tun_select_queue(struct net_device *dev, struct sk_buff *skb,
 			txq -= numqueues;
 	}
 
-	rcu_read_unlock();
 	return txq;
 }
 
+static u16 tun_ebpf_select_queue(struct tun_struct *tun, struct sk_buff *skb)
+{
+	struct tun_steering_prog *prog;
+	u16 ret = 0;
+
+	prog = rcu_dereference(tun->steering_prog);
+	if (prog)
+		ret = bpf_prog_run_clear_cb(prog->prog, skb);
+
+	return ret % tun->numqueues;
+}
+
+static u16 tun_select_queue(struct net_device *dev, struct sk_buff *skb,
+			    void *accel_priv, select_queue_fallback_t fallback)
+{
+	struct tun_struct *tun = netdev_priv(dev);
+	u16 ret;
+
+	rcu_read_lock();
+	if (rcu_dereference(tun->steering_prog))
+		ret = tun_ebpf_select_queue(tun, skb);
+	else
+		ret = tun_automq_select_queue(tun, skb);
+	rcu_read_unlock();
+
+	return ret;
+}
+
 static inline bool tun_not_capable(struct tun_struct *tun)
 {
 	const struct cred *cred = current_cred();
@@ -673,7 +703,6 @@ static void tun_detach(struct tun_file *tfile, bool clean)
 static void tun_detach_all(struct net_device *dev)
 {
 	struct tun_struct *tun = netdev_priv(dev);
-	struct bpf_prog *xdp_prog = rtnl_dereference(tun->xdp_prog);
 	struct tun_file *tfile, *tmp;
 	int i, n = tun->numqueues;
 
@@ -708,9 +737,6 @@ static void tun_detach_all(struct net_device *dev)
 	}
 	BUG_ON(tun->numdisabled != 0);
 
-	if (xdp_prog)
-		bpf_prog_put(xdp_prog);
-
 	if (tun->flags & IFF_PERSIST)
 		module_put(THIS_MODULE);
 }
@@ -937,23 +963,10 @@ static int tun_net_close(struct net_device *dev)
 }
 
 /* Net device start xmit */
-static netdev_tx_t tun_net_xmit(struct sk_buff *skb, struct net_device *dev)
+static void tun_automq_xmit(struct tun_struct *tun, struct sk_buff *skb)
 {
-	struct tun_struct *tun = netdev_priv(dev);
-	int txq = skb->queue_mapping;
-	struct tun_file *tfile;
-	u32 numqueues = 0;
-
-	rcu_read_lock();
-	tfile = rcu_dereference(tun->tfiles[txq]);
-	numqueues = READ_ONCE(tun->numqueues);
-
-	/* Drop packet if interface is not attached */
-	if (txq >= numqueues)
-		goto drop;
-
 #ifdef CONFIG_RPS
-	if (numqueues == 1 && static_key_false(&rps_needed)) {
+	if (tun->numqueues == 1 && static_key_false(&rps_needed)) {
 		/* Select queue was not called for the skbuff, so we extract the
 		 * RPS hash and save it into the flow_table here.
 		 */
@@ -969,6 +982,24 @@ static netdev_tx_t tun_net_xmit(struct sk_buff *skb, struct net_device *dev)
 		}
 	}
 #endif
+}
+
+/* Net device start xmit */
+static netdev_tx_t tun_net_xmit(struct sk_buff *skb, struct net_device *dev)
+{
+	struct tun_struct *tun = netdev_priv(dev);
+	int txq = skb->queue_mapping;
+	struct tun_file *tfile;
+
+	rcu_read_lock();
+	tfile = rcu_dereference(tun->tfiles[txq]);
+
+	/* Drop packet if interface is not attached */
+	if (txq >= tun->numqueues)
+		goto drop;
+
+	if (!rcu_dereference(tun->steering_prog))
+		tun_automq_xmit(tun, skb);
 
 	tun_debug(KERN_INFO, tun, "tun_net_xmit %d\n", skb->len);
 
@@ -1551,7 +1582,7 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
 	int copylen;
 	bool zerocopy = false;
 	int err;
-	u32 rxhash;
+	u32 rxhash = 0;
 	int skb_xdp = 1;
 	bool frags = tun_napi_frags_enabled(tun);
 
@@ -1739,7 +1770,10 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
 		rcu_read_unlock();
 	}
 
-	rxhash = __skb_get_hash_symmetric(skb);
+	rcu_read_lock();
+	if (!rcu_dereference(tun->steering_prog))
+		rxhash = __skb_get_hash_symmetric(skb);
+	rcu_read_unlock();
 
 	if (frags) {
 		/* Exercise flow dissector code path. */
@@ -1783,7 +1817,9 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
 	u64_stats_update_end(&stats->syncp);
 	put_cpu_ptr(stats);
 
-	tun_flow_update(tun, rxhash, tfile);
+	if (rxhash)
+		tun_flow_update(tun, rxhash, tfile);
+
 	return total_len;
 }
 
@@ -1991,6 +2027,39 @@ static ssize_t tun_chr_read_iter(struct kiocb *iocb, struct iov_iter *to)
 	return ret;
 }
 
+static void tun_steering_prog_free(struct rcu_head *rcu)
+{
+	struct tun_steering_prog *prog = container_of(rcu,
+					 struct tun_steering_prog, rcu);
+
+	bpf_prog_destroy(prog->prog);
+	kfree(prog);
+}
+
+static int __tun_set_steering_ebpf(struct tun_struct *tun,
+				   struct bpf_prog *prog)
+{
+	struct tun_steering_prog *old, *new = NULL;
+
+	if (prog) {
+		new = kmalloc(sizeof(*new), GFP_KERNEL);
+		if (!new)
+			return -ENOMEM;
+		new->prog = prog;
+	}
+
+	spin_lock_bh(&tun->lock);
+	old = rcu_dereference_protected(tun->steering_prog,
+					lockdep_is_held(&tun->lock));
+	rcu_assign_pointer(tun->steering_prog, new);
+	spin_unlock_bh(&tun->lock);
+
+	if (old)
+		call_rcu(&old->rcu, tun_steering_prog_free);
+
+	return 0;
+}
+
 static void tun_free_netdev(struct net_device *dev)
 {
 	struct tun_struct *tun = netdev_priv(dev);
@@ -1999,6 +2068,7 @@ static void tun_free_netdev(struct net_device *dev)
 	free_percpu(tun->pcpu_stats);
 	tun_flow_uninit(tun);
 	security_tun_dev_free_security(tun->security);
+	__tun_set_steering_ebpf(tun, NULL);
 }
 
 static void tun_setup(struct net_device *dev)
@@ -2287,6 +2357,7 @@ static int tun_set_iff(struct net *net, struct file *file, struct ifreq *ifr)
 		tun->filter_attached = false;
 		tun->sndbuf = tfile->socket.sk->sk_sndbuf;
 		tun->rx_batched = 0;
+		RCU_INIT_POINTER(tun->steering_prog, NULL);
 
 		tun->pcpu_stats = netdev_alloc_pcpu_stats(struct tun_pcpu_stats);
 		if (!tun->pcpu_stats) {
@@ -2479,6 +2550,25 @@ unlock:
 	return ret;
 }
 
+static int tun_set_steering_ebpf(struct tun_struct *tun, void __user *data)
+{
+	struct bpf_prog *prog;
+	int fd;
+
+	if (copy_from_user(&fd, data, sizeof(fd)))
+		return -EFAULT;
+
+	if (fd == -1) {
+		prog = NULL;
+	} else {
+		prog = bpf_prog_get_type(fd, BPF_PROG_TYPE_SOCKET_FILTER);
+		if (IS_ERR(prog))
+			return PTR_ERR(prog);
+	}
+
+	return __tun_set_steering_ebpf(tun, prog);
+}
+
 static long __tun_chr_ioctl(struct file *file, unsigned int cmd,
 			    unsigned long arg, int ifreq_len)
 {
@@ -2755,6 +2845,10 @@ static long __tun_chr_ioctl(struct file *file, unsigned int cmd,
 		ret = 0;
 		break;
 
+	case TUNSETSTEERINGEBPF:
+		ret = tun_set_steering_ebpf(tun, argp);
+		break;
+
 	default:
 		ret = -EINVAL;
 		break;
diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index f5438d0978ca..a69ad39ee57e 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -410,6 +410,9 @@ static int veth_newlink(struct net *src_net, struct net_device *dev,
 	if (ifmp && (dev->ifindex != 0))
 		peer->ifindex = ifmp->ifi_index;
 
+	peer->gso_max_size = dev->gso_max_size;
+	peer->gso_max_segs = dev->gso_max_segs;
+
 	err = register_netdevice(peer);
 	put_net(net);
 	net = NULL;
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 559b215c0169..6fb7b658a6cc 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -261,9 +261,12 @@ static void virtqueue_napi_complete(struct napi_struct *napi,
 	int opaque;
 
 	opaque = virtqueue_enable_cb_prepare(vq);
-	if (napi_complete_done(napi, processed) &&
-	    unlikely(virtqueue_poll(vq, opaque)))
-		virtqueue_napi_schedule(napi, vq);
+	if (napi_complete_done(napi, processed)) {
+		if (unlikely(virtqueue_poll(vq, opaque)))
+			virtqueue_napi_schedule(napi, vq);
+	} else {
+		virtqueue_disable_cb(vq);
+	}
 }
 
 static void skb_xmit_done(struct virtqueue *vq)
diff --git a/drivers/net/vmxnet3/vmxnet3_int.h b/drivers/net/vmxnet3/vmxnet3_int.h
index 9c51b8be0038..5ba222920e80 100644
--- a/drivers/net/vmxnet3/vmxnet3_int.h
+++ b/drivers/net/vmxnet3/vmxnet3_int.h
@@ -69,10 +69,10 @@
 /*
  * Version numbers
  */
-#define VMXNET3_DRIVER_VERSION_STRING   "1.4.a.0-k"
+#define VMXNET3_DRIVER_VERSION_STRING   "1.4.11.0-k"
 
 /* a 32-bit int, each byte encode a verion number in VMXNET3_DRIVER_VERSION */
-#define VMXNET3_DRIVER_VERSION_NUM      0x01040a00
+#define VMXNET3_DRIVER_VERSION_NUM      0x01040b00
 
 #if defined(CONFIG_PCI_MSI)
 	/* RSS only makes sense if MSI-X is supported. */
@@ -416,8 +416,8 @@ struct vmxnet3_adapter {
 
 /* must be a multiple of VMXNET3_RING_SIZE_ALIGN */
 #define VMXNET3_DEF_TX_RING_SIZE    512
-#define VMXNET3_DEF_RX_RING_SIZE    256
-#define VMXNET3_DEF_RX_RING2_SIZE   128
+#define VMXNET3_DEF_RX_RING_SIZE    1024
+#define VMXNET3_DEF_RX_RING2_SIZE   256
 
 #define VMXNET3_DEF_RXDATA_DESC_SIZE 128
 
diff --git a/include/linux/dsa/lan9303.h b/include/linux/dsa/lan9303.h
index f48a85c377de..b6514c29563f 100644
--- a/include/linux/dsa/lan9303.h
+++ b/include/linux/dsa/lan9303.h
@@ -26,6 +26,7 @@ struct lan9303 {
 	bool phy_addr_sel_strap;
 	struct dsa_switch *ds;
 	struct mutex indirect_mutex; /* protect indexed register access */
+	struct mutex alr_mutex; /* protect ALR access */
 	const struct lan9303_phy_ops *ops;
 	bool is_bridged; /* true if port 1 and 2 are bridged */
 
diff --git a/include/linux/filter.h b/include/linux/filter.h
index 80b5b482cb46..0062302e1285 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -985,6 +985,7 @@ struct bpf_sock_ops_kern {
 		u32 reply;
 		u32 replylong[4];
 	};
+	u32	is_fullsock;
 };
 
 #endif /* __LINUX_FILTER_H__ */
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index 6c9336626592..93bd6fcd6e62 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -127,28 +127,6 @@ struct hv_ring_buffer_info {
 	u32 priv_read_index;
 };
 
-/*
- *
- * hv_get_ringbuffer_availbytes()
- *
- * Get number of bytes available to read and to write to
- * for the specified ring buffer
- */
-static inline void
-hv_get_ringbuffer_availbytes(const struct hv_ring_buffer_info *rbi,
-			     u32 *read, u32 *write)
-{
-	u32 read_loc, write_loc, dsize;
-
-	/* Capture the read/write indices before they changed */
-	read_loc = rbi->ring_buffer->read_index;
-	write_loc = rbi->ring_buffer->write_index;
-	dsize = rbi->ring_datasize;
-
-	*write = write_loc >= read_loc ? dsize - (write_loc - read_loc) :
-		read_loc - write_loc;
-	*read = dsize - *write;
-}
 
 static inline u32 hv_get_bytes_to_read(const struct hv_ring_buffer_info *rbi)
 {
diff --git a/include/linux/if_macvlan.h b/include/linux/if_macvlan.h
index bedf54b6f943..4cb7aeeafce0 100644
--- a/include/linux/if_macvlan.h
+++ b/include/linux/if_macvlan.h
@@ -30,10 +30,10 @@ struct macvlan_dev {
 	enum macvlan_mode	mode;
 	u16			flags;
 	int			nest_level;
+	unsigned int		macaddr_count;
 #ifdef CONFIG_NET_POLL_CONTROLLER
 	struct netpoll		*netpoll;
 #endif
-	unsigned int		macaddr_count;
 };
 
 static inline void macvlan_count_rx(const struct macvlan_dev *vlan,
diff --git a/include/linux/mdio.h b/include/linux/mdio.h
index ca08ab16ecdc..92d4e55ffe67 100644
--- a/include/linux/mdio.h
+++ b/include/linux/mdio.h
@@ -12,6 +12,7 @@
 #include <uapi/linux/mdio.h>
 #include <linux/mod_devicetable.h>
 
+struct gpio_desc;
 struct mii_bus;
 
 /* Multiple levels of nesting are possible. However typically this is
@@ -39,6 +40,7 @@ struct mdio_device {
 	/* Bus address of the MDIO device (0-31) */
 	int addr;
 	int flags;
+	struct gpio_desc *reset;
 };
 #define to_mdio_device(d) container_of(d, struct mdio_device, dev)
 
@@ -71,6 +73,7 @@ void mdio_device_free(struct mdio_device *mdiodev);
 struct mdio_device *mdio_device_create(struct mii_bus *bus, int addr);
 int mdio_device_register(struct mdio_device *mdiodev);
 void mdio_device_remove(struct mdio_device *mdiodev);
+void mdio_device_reset(struct mdio_device *mdiodev, int value);
 int mdio_driver_register(struct mdio_driver *drv);
 void mdio_driver_unregister(struct mdio_driver *drv);
 int mdio_device_bus_match(struct device *dev, struct device_driver *drv);
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index ef789e1d679e..cc4ce7456e38 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -820,6 +820,8 @@ struct netdev_bpf {
 		struct {
 			u8 prog_attached;
 			u32 prog_id;
+			/* flags with which program was installed */
+			u32 prog_flags;
 		};
 		/* BPF_OFFLOAD_VERIFIER_PREP */
 		struct {
@@ -3330,7 +3332,8 @@ struct sk_buff *dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev,
 typedef int (*bpf_op_t)(struct net_device *dev, struct netdev_bpf *bpf);
 int dev_change_xdp_fd(struct net_device *dev, struct netlink_ext_ack *extack,
 		      int fd, u32 flags);
-u8 __dev_xdp_attached(struct net_device *dev, bpf_op_t xdp_op, u32 *prog_id);
+void __dev_xdp_query(struct net_device *dev, bpf_op_t xdp_op,
+		     struct netdev_bpf *xdp);
 
 int __dev_forward_skb(struct net_device *dev, struct sk_buff *skb);
 int dev_forward_skb(struct net_device *dev, struct sk_buff *skb);
diff --git a/include/linux/phy.h b/include/linux/phy.h
index dc82a07cb4fd..d3037e2ffbc4 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -468,7 +468,6 @@ struct phy_device {
 	/* Interrupt and Polling infrastructure */
 	struct work_struct phy_queue;
 	struct delayed_work state_queue;
-	atomic_t irq_disable;
 
 	struct mutex lock;
 
@@ -497,19 +496,19 @@ struct phy_device {
  * flags: A bitfield defining certain other features this PHY
  *   supports (like interrupts)
  *
- * The drivers must implement config_aneg and read_status.  All
- * other functions are optional. Note that none of these
- * functions should be called from interrupt time.  The goal is
- * for the bus read/write functions to be able to block when the
- * bus transaction is happening, and be freed up by an interrupt
- * (The MPC85xx has this ability, though it is not currently
- * supported in the driver).
+ * All functions are optional. If config_aneg or read_status
+ * are not implemented, the phy core uses the genphy versions.
+ * Note that none of these functions should be called from
+ * interrupt time. The goal is for the bus read/write functions
+ * to be able to block when the bus transaction is happening,
+ * and be freed up by an interrupt (The MPC85xx has this ability,
+ * though it is not currently supported in the driver).
  */
 struct phy_driver {
 	struct mdio_driver_common mdiodrv;
 	u32 phy_id;
 	char *name;
-	unsigned int phy_id_mask;
+	u32 phy_id_mask;
 	u32 features;
 	u32 flags;
 	const void *driver_data;
@@ -763,6 +762,20 @@ static inline bool phy_interface_mode_is_rgmii(phy_interface_t mode)
 };
 
 /**
+ * phy_interface_mode_is_8023z() - does the phy interface mode use 802.3z
+ *   negotiation
+ * @mode: one of &enum phy_interface_t
+ *
+ * Returns true if the phy interface mode uses the 16-bit negotiation
+ * word as defined in 802.3z. (See 802.3-2015 37.2.1 Config_Reg encoding)
+ */
+static inline bool phy_interface_mode_is_8023z(phy_interface_t mode)
+{
+	return mode == PHY_INTERFACE_MODE_1000BASEX ||
+	       mode == PHY_INTERFACE_MODE_2500BASEX;
+}
+
+/**
  * phy_interface_is_rgmii - Convenience function for testing if a PHY interface
  * is RGMII (all variants)
  * @phydev: the phy_device struct
@@ -841,12 +854,9 @@ int phy_aneg_done(struct phy_device *phydev);
 int phy_stop_interrupts(struct phy_device *phydev);
 int phy_restart_aneg(struct phy_device *phydev);
 
-static inline int phy_read_status(struct phy_device *phydev)
+static inline void phy_device_reset(struct phy_device *phydev, int value)
 {
-	if (!phydev->drv)
-		return -EIO;
-
-	return phydev->drv->read_status(phydev);
+	mdio_device_reset(&phydev->mdio, value);
 }
 
 #define phydev_err(_phydev, format, args...)	\
@@ -890,6 +900,17 @@ int genphy_c45_read_pma(struct phy_device *phydev);
 int genphy_c45_pma_setup_forced(struct phy_device *phydev);
 int genphy_c45_an_disable_aneg(struct phy_device *phydev);
 
+static inline int phy_read_status(struct phy_device *phydev)
+{
+	if (!phydev->drv)
+		return -EIO;
+
+	if (phydev->drv->read_status)
+		return phydev->drv->read_status(phydev);
+	else
+		return genphy_read_status(phydev);
+}
+
 void phy_driver_unregister(struct phy_driver *drv);
 void phy_drivers_unregister(struct phy_driver *drv, int n);
 int phy_driver_register(struct phy_driver *new_driver, struct module *owner);
diff --git a/include/linux/phylink.h b/include/linux/phylink.h
index af67edd4ae38..4f0f452ff38d 100644
--- a/include/linux/phylink.h
+++ b/include/linux/phylink.h
@@ -7,6 +7,7 @@
 
 struct device_node;
 struct ethtool_cmd;
+struct fwnode_handle;
 struct net_device;
 
 enum {
@@ -20,19 +21,31 @@ enum {
 
 	MLO_AN_PHY = 0,	/* Conventional PHY */
 	MLO_AN_FIXED,	/* Fixed-link mode */
-	MLO_AN_SGMII,	/* Cisco SGMII protocol */
-	MLO_AN_8023Z,	/* 1000base-X protocol */
+	MLO_AN_INBAND,	/* In-band protocol */
 };
 
 static inline bool phylink_autoneg_inband(unsigned int mode)
 {
-	return mode == MLO_AN_SGMII || mode == MLO_AN_8023Z;
+	return mode == MLO_AN_INBAND;
 }
 
+/**
+ * struct phylink_link_state - link state structure
+ * @advertising: ethtool bitmask containing advertised link modes
+ * @lp_advertising: ethtool bitmask containing link partner advertised link
+ *   modes
+ * @interface: link &typedef phy_interface_t mode
+ * @speed: link speed, one of the SPEED_* constants.
+ * @duplex: link duplex mode, one of DUPLEX_* constants.
+ * @pause: link pause state, described by MLO_PAUSE_* constants.
+ * @link: true if the link is up.
+ * @an_enabled: true if autonegotiation is enabled/desired.
+ * @an_complete: true if autonegotiation has completed.
+ */
 struct phylink_link_state {
 	__ETHTOOL_DECLARE_LINK_MODE_MASK(advertising);
 	__ETHTOOL_DECLARE_LINK_MODE_MASK(lp_advertising);
-	phy_interface_t interface;	/* PHY_INTERFACE_xxx */
+	phy_interface_t interface;
 	int speed;
 	int duplex;
 	int pause;
@@ -41,66 +54,136 @@ struct phylink_link_state {
 	unsigned int an_complete:1;
 };
 
+/**
+ * struct phylink_mac_ops - MAC operations structure.
+ * @validate: Validate and update the link configuration.
+ * @mac_link_state: Read the current link state from the hardware.
+ * @mac_config: configure the MAC for the selected mode and state.
+ * @mac_an_restart: restart 802.3z BaseX autonegotiation.
+ * @mac_link_down: take the link down.
+ * @mac_link_up: allow the link to come up.
+ *
+ * The individual methods are described more fully below.
+ */
 struct phylink_mac_ops {
-	/**
-	 * validate: validate and update the link configuration
-	 * @ndev: net_device structure associated with MAC
-	 * @config: configuration to validate
-	 *
-	 * Update the %config->supported and %config->advertised masks
-	 * clearing bits that can not be supported.
-	 *
-	 * Note: the PHY may be able to transform from one connection
-	 * technology to another, so, eg, don't clear 1000BaseX just
-	 * because the MAC is unable to support it.  This is more about
-	 * clearing unsupported speeds and duplex settings.
-	 *
-	 * If the %config->interface mode is %PHY_INTERFACE_MODE_1000BASEX
-	 * or %PHY_INTERFACE_MODE_2500BASEX, select the appropriate mode
-	 * based on %config->advertised and/or %config->speed.
-	 */
 	void (*validate)(struct net_device *ndev, unsigned long *supported,
 			 struct phylink_link_state *state);
-
-	/* Read the current link state from the hardware */
-	int (*mac_link_state)(struct net_device *, struct phylink_link_state *);
-
-	/* Configure the MAC */
-	/**
-	 * mac_config: configure the MAC for the selected mode and state
-	 * @ndev: net_device structure for the MAC
-	 * @mode: one of MLO_AN_FIXED, MLO_AN_PHY, MLO_AN_8023Z, MLO_AN_SGMII
-	 * @state: state structure
-	 *
-	 * The action performed depends on the currently selected mode:
-	 *
-	 * %MLO_AN_FIXED, %MLO_AN_PHY:
-	 *   set the specified speed, duplex, pause mode, and phy interface
-	 *   mode in the provided @state.
-	 * %MLO_AN_8023Z:
-	 *   place the link in 1000base-X mode, advertising the parameters
-	 *   given in advertising in @state.
-	 * %MLO_AN_SGMII:
-	 *   place the link in Cisco SGMII mode - there is no advertisment
-	 *   to make as the PHY communicates the speed and duplex to the
-	 *   MAC over the in-band control word.  Configuration of the pause
-	 *   mode is as per MLO_AN_PHY since this is not included.
-	 */
+	int (*mac_link_state)(struct net_device *ndev,
+			      struct phylink_link_state *state);
 	void (*mac_config)(struct net_device *ndev, unsigned int mode,
 			   const struct phylink_link_state *state);
-
-	/**
-	 * mac_an_restart: restart 802.3z BaseX autonegotiation
-	 * @ndev: net_device structure for the MAC
-	 */
 	void (*mac_an_restart)(struct net_device *ndev);
-
-	void (*mac_link_down)(struct net_device *, unsigned int mode);
-	void (*mac_link_up)(struct net_device *, unsigned int mode,
-			    struct phy_device *);
+	void (*mac_link_down)(struct net_device *ndev, unsigned int mode);
+	void (*mac_link_up)(struct net_device *ndev, unsigned int mode,
+			    struct phy_device *phy);
 };
 
-struct phylink *phylink_create(struct net_device *, struct device_node *,
+#if 0 /* For kernel-doc purposes only. */
+/**
+ * validate - Validate and update the link configuration
+ * @ndev: a pointer to a &struct net_device for the MAC.
+ * @supported: ethtool bitmask for supported link modes.
+ * @state: a pointer to a &struct phylink_link_state.
+ *
+ * Clear bits in the @supported and @state->advertising masks that
+ * are not supportable by the MAC.
+ *
+ * Note that the PHY may be able to transform from one connection
+ * technology to another, so, eg, don't clear 1000BaseX just
+ * because the MAC is unable to BaseX mode. This is more about
+ * clearing unsupported speeds and duplex settings.
+ *
+ * If the @state->interface mode is %PHY_INTERFACE_MODE_1000BASEX
+ * or %PHY_INTERFACE_MODE_2500BASEX, select the appropriate mode
+ * based on @state->advertising and/or @state->speed and update
+ * @state->interface accordingly.
+ */
+void validate(struct net_device *ndev, unsigned long *supported,
+	      struct phylink_link_state *state);
+
+/**
+ * mac_link_state() - Read the current link state from the hardware
+ * @ndev: a pointer to a &struct net_device for the MAC.
+ * @state: a pointer to a &struct phylink_link_state.
+ *
+ * Read the current link state from the MAC, reporting the current
+ * speed in @state->speed, duplex mode in @state->duplex, pause mode
+ * in @state->pause using the %MLO_PAUSE_RX and %MLO_PAUSE_TX bits,
+ * negotiation completion state in @state->an_complete, and link
+ * up state in @state->link.
+ */
+int mac_link_state(struct net_device *ndev,
+		   struct phylink_link_state *state);
+
+/**
+ * mac_config() - configure the MAC for the selected mode and state
+ * @ndev: a pointer to a &struct net_device for the MAC.
+ * @mode: one of %MLO_AN_FIXED, %MLO_AN_PHY, %MLO_AN_INBAND.
+ * @state: a pointer to a &struct phylink_link_state.
+ *
+ * The action performed depends on the currently selected mode:
+ *
+ * %MLO_AN_FIXED, %MLO_AN_PHY:
+ *   Configure the specified @state->speed, @state->duplex and
+ *   @state->pause (%MLO_PAUSE_TX / %MLO_PAUSE_RX) mode.
+ *
+ * %MLO_AN_INBAND:
+ *   place the link in an inband negotiation mode (such as 802.3z
+ *   1000base-X or Cisco SGMII mode depending on the @state->interface
+ *   mode). In both cases, link state management (whether the link
+ *   is up or not) is performed by the MAC, and reported via the
+ *   mac_link_state() callback. Changes in link state must be made
+ *   by calling phylink_mac_change().
+ *
+ *   If in 802.3z mode, the link speed is fixed, dependent on the
+ *   @state->interface. Duplex is negotiated, and pause is advertised
+ *   according to @state->an_enabled, @state->pause and
+ *   @state->advertising flags. Beware of MACs which only support full
+ *   duplex at gigabit and higher speeds.
+ *
+ *   If in Cisco SGMII mode, the link speed and duplex mode are passed
+ *   in the serial bitstream 16-bit configuration word, and the MAC
+ *   should be configured to read these bits and acknowledge the
+ *   configuration word. Nothing is advertised by the MAC. The MAC is
+ *   responsible for reading the configuration word and configuring
+ *   itself accordingly.
+ */
+void mac_config(struct net_device *ndev, unsigned int mode,
+		const struct phylink_link_state *state);
+
+/**
+ * mac_an_restart() - restart 802.3z BaseX autonegotiation
+ * @ndev: a pointer to a &struct net_device for the MAC.
+ */
+void mac_an_restart(struct net_device *ndev);
+
+/**
+ * mac_link_down() - take the link down
+ * @ndev: a pointer to a &struct net_device for the MAC.
+ * @mode: link autonegotiation mode
+ *
+ * If @mode is not an in-band negotiation mode (as defined by
+ * phylink_autoneg_inband()), force the link down and disable any
+ * Energy Efficient Ethernet MAC configuration.
+ */
+void mac_link_down(struct net_device *ndev, unsigned int mode);
+
+/**
+ * mac_link_up() - allow the link to come up
+ * @ndev: a pointer to a &struct net_device for the MAC.
+ * @mode: link autonegotiation mode
+ * @phy: any attached phy
+ *
+ * If @mode is not an in-band negotiation mode (as defined by
+ * phylink_autoneg_inband()), allow the link to come up. If @phy
+ * is non-%NULL, configure Energy Efficient Ethernet by calling
+ * phy_init_eee() and perform appropriate MAC configuration for EEE.
+ */
+void mac_link_up(struct net_device *ndev, unsigned int mode,
+		 struct phy_device *phy);
+#endif
+
+struct phylink *phylink_create(struct net_device *, struct fwnode_handle *,
 	phy_interface_t iface, const struct phylink_mac_ops *ops);
 void phylink_destroy(struct phylink *);
 
@@ -128,7 +211,6 @@ int phylink_ethtool_set_pauseparam(struct phylink *,
 int phylink_ethtool_get_module_info(struct phylink *, struct ethtool_modinfo *);
 int phylink_ethtool_get_module_eeprom(struct phylink *,
 				      struct ethtool_eeprom *, u8 *);
-int phylink_init_eee(struct phylink *, bool);
 int phylink_get_eee_err(struct phylink *);
 int phylink_ethtool_get_eee(struct phylink *, struct ethtool_eee *);
 int phylink_ethtool_set_eee(struct phylink *, struct ethtool_eee *);
diff --git a/include/linux/sfp.h b/include/linux/sfp.h
index 4a906f560817..47ea32d3e816 100644
--- a/include/linux/sfp.h
+++ b/include/linux/sfp.h
@@ -3,7 +3,7 @@
 
 #include <linux/phy.h>
 
-struct __packed sfp_eeprom_base {
+struct sfp_eeprom_base {
 	u8 phys_id;
 	u8 phys_ext_id;
 	u8 connector;
@@ -166,12 +166,12 @@ struct __packed sfp_eeprom_base {
 	union {
 		__be16 optical_wavelength;
 		u8 cable_spec;
-	};
+	} __packed;
 	u8 reserved62;
 	u8 cc_base;
-};
+} __packed;
 
-struct __packed sfp_eeprom_ext {
+struct sfp_eeprom_ext {
 	__be16 options;
 	u8 br_max;
 	u8 br_min;
@@ -181,12 +181,21 @@ struct __packed sfp_eeprom_ext {
 	u8 enhopts;
 	u8 sff8472_compliance;
 	u8 cc_ext;
-};
-
-struct __packed sfp_eeprom_id {
+} __packed;
+
+/**
+ * struct sfp_eeprom_id - raw SFP module identification information
+ * @base: base SFP module identification structure
+ * @ext: extended SFP module identification structure
+ *
+ * See the SFF-8472 specification and related documents for the definition
+ * of these structure members. This can be obtained from
+ * ftp://ftp.seagate.com/sff
+ */
+struct sfp_eeprom_id {
 	struct sfp_eeprom_base base;
 	struct sfp_eeprom_ext ext;
-};
+} __packed;
 
 /* SFP EEPROM registers */
 enum {
@@ -347,19 +356,32 @@ enum {
 	SFP_PAGE			= 0x7f,
 };
 
-struct device_node;
+struct fwnode_handle;
 struct ethtool_eeprom;
 struct ethtool_modinfo;
 struct net_device;
 struct sfp_bus;
 
+/**
+ * struct sfp_upstream_ops - upstream operations structure
+ * @module_insert: called after a module has been detected to determine
+ *   whether the module is supported for the upstream device.
+ * @module_remove: called after the module has been removed.
+ * @link_down: called when the link is non-operational for whatever
+ *   reason.
+ * @link_up: called when the link is operational.
+ * @connect_phy: called when an I2C accessible PHY has been detected
+ *   on the module.
+ * @disconnect_phy: called when a module with an I2C accessible PHY has
+ *   been removed.
+ */
 struct sfp_upstream_ops {
-	int (*module_insert)(void *, const struct sfp_eeprom_id *id);
-	void (*module_remove)(void *);
-	void (*link_down)(void *);
-	void (*link_up)(void *);
-	int (*connect_phy)(void *, struct phy_device *);
-	void (*disconnect_phy)(void *);
+	int (*module_insert)(void *priv, const struct sfp_eeprom_id *id);
+	void (*module_remove)(void *priv);
+	void (*link_down)(void *priv);
+	void (*link_up)(void *priv);
+	int (*connect_phy)(void *priv, struct phy_device *);
+	void (*disconnect_phy)(void *priv);
 };
 
 #if IS_ENABLED(CONFIG_SFP)
@@ -375,7 +397,7 @@ int sfp_get_module_eeprom(struct sfp_bus *bus, struct ethtool_eeprom *ee,
 			  u8 *data);
 void sfp_upstream_start(struct sfp_bus *bus);
 void sfp_upstream_stop(struct sfp_bus *bus);
-struct sfp_bus *sfp_register_upstream(struct device_node *np,
+struct sfp_bus *sfp_register_upstream(struct fwnode_handle *fwnode,
 				      struct net_device *ndev, void *upstream,
 				      const struct sfp_upstream_ops *ops);
 void sfp_unregister_upstream(struct sfp_bus *bus);
@@ -419,7 +441,8 @@ static inline void sfp_upstream_stop(struct sfp_bus *bus)
 {
 }
 
-static inline struct sfp_bus *sfp_register_upstream(struct device_node *np,
+static inline struct sfp_bus *sfp_register_upstream(
+	struct fwnode_handle *fwnode,
 	struct net_device *ndev, void *upstream,
 	const struct sfp_upstream_ops *ops)
 {
diff --git a/include/linux/skb_array.h b/include/linux/skb_array.h
index 8621ffdeecbf..c7addf37d119 100644
--- a/include/linux/skb_array.h
+++ b/include/linux/skb_array.h
@@ -72,6 +72,11 @@ static inline bool __skb_array_empty(struct skb_array *a)
 	return !__ptr_ring_peek(&a->ring);
 }
 
+static inline struct sk_buff *__skb_array_peek(struct skb_array *a)
+{
+	return __ptr_ring_peek(&a->ring);
+}
+
 static inline bool skb_array_empty(struct skb_array *a)
 {
 	return ptr_ring_empty(&a->ring);
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index a38c80e9f91e..b8e0da6c27d6 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1211,6 +1211,11 @@ static inline bool skb_flow_dissect_flow_keys_buf(struct flow_keys *flow,
 				  data, proto, nhoff, hlen, flags);
 }
 
+void
+skb_flow_dissect_tunnel_info(const struct sk_buff *skb,
+			     struct flow_dissector *flow_dissector,
+			     void *target_container);
+
 static inline __u32 skb_get_hash(struct sk_buff *skb)
 {
 	if (!skb->l4_hash && !skb->sw_hash)
diff --git a/include/net/act_api.h b/include/net/act_api.h
index fd08df74c466..02bf409140d0 100644
--- a/include/net/act_api.h
+++ b/include/net/act_api.h
@@ -86,7 +86,7 @@ struct tc_action_ops {
 	int     (*act)(struct sk_buff *, const struct tc_action *,
 		       struct tcf_result *);
 	int     (*dump)(struct sk_buff *, struct tc_action *, int, int);
-	void	(*cleanup)(struct tc_action *, int bind);
+	void	(*cleanup)(struct tc_action *);
 	int     (*lookup)(struct net *, struct tc_action **, u32);
 	int     (*init)(struct net *net, struct nlattr *nla,
 			struct nlattr *est, struct tc_action **act, int ovr,
diff --git a/include/net/addrconf.h b/include/net/addrconf.h
index b623b65a79d1..c4185a7b0e90 100644
--- a/include/net/addrconf.h
+++ b/include/net/addrconf.h
@@ -180,7 +180,7 @@ static inline int addrconf_finite_timeout(unsigned long timeout)
  */
 int ipv6_addr_label_init(void);
 void ipv6_addr_label_cleanup(void);
-void ipv6_addr_label_rtnl_register(void);
+int ipv6_addr_label_rtnl_register(void);
 u32 ipv6_addr_label(struct net *net, const struct in6_addr *addr,
 		    int type, int ifindex);
 
diff --git a/include/net/dn_route.h b/include/net/dn_route.h
index 55df9939bca2..342d2503cba5 100644
--- a/include/net/dn_route.h
+++ b/include/net/dn_route.h
@@ -69,6 +69,7 @@ int dn_route_rcv(struct sk_buff *skb, struct net_device *dev,
  */
 struct dn_route {
 	struct dst_entry dst;
+	struct dn_route __rcu *dn_next;
 
 	struct neighbour *n;
 
diff --git a/include/net/dsa.h b/include/net/dsa.h
index 2a05738570d8..6cb602dd970c 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -296,31 +296,39 @@ static inline u32 dsa_user_ports(struct dsa_switch *ds)
 	return mask;
 }
 
-static inline u8 dsa_upstream_port(struct dsa_switch *ds)
+/* Return the local port used to reach an arbitrary switch port */
+static inline unsigned int dsa_towards_port(struct dsa_switch *ds, int device,
+					    int port)
 {
-	struct dsa_switch_tree *dst = ds->dst;
-
-	/*
-	 * If this is the root switch (i.e. the switch that connects
-	 * to the CPU), return the cpu port number on this switch.
-	 * Else return the (DSA) port number that connects to the
-	 * switch that is one hop closer to the cpu.
-	 */
-	if (dst->cpu_dp->ds == ds)
-		return dst->cpu_dp->index;
+	if (device == ds->index)
+		return port;
 	else
-		return ds->rtable[dst->cpu_dp->ds->index];
+		return ds->rtable[device];
+}
+
+/* Return the local port used to reach the dedicated CPU port */
+static inline unsigned int dsa_upstream_port(struct dsa_switch *ds, int port)
+{
+	const struct dsa_port *dp = dsa_to_port(ds, port);
+	const struct dsa_port *cpu_dp = dp->cpu_dp;
+
+	if (!cpu_dp)
+		return port;
+
+	return dsa_towards_port(ds, cpu_dp->ds->index, cpu_dp->index);
 }
 
 typedef int dsa_fdb_dump_cb_t(const unsigned char *addr, u16 vid,
 			      bool is_static, void *data);
 struct dsa_switch_ops {
+#if IS_ENABLED(CONFIG_NET_DSA_LEGACY)
 	/*
 	 * Legacy probing.
 	 */
 	const char	*(*probe)(struct device *dsa_dev,
 				  struct device *host_dev, int sw_addr,
 				  void **priv);
+#endif
 
 	enum dsa_tag_protocol (*get_tag_protocol)(struct dsa_switch *ds,
 						  int port);
@@ -412,12 +420,10 @@ struct dsa_switch_ops {
 	 */
 	int	(*port_vlan_filtering)(struct dsa_switch *ds, int port,
 				       bool vlan_filtering);
-	int	(*port_vlan_prepare)(struct dsa_switch *ds, int port,
-				     const struct switchdev_obj_port_vlan *vlan,
-				     struct switchdev_trans *trans);
-	void	(*port_vlan_add)(struct dsa_switch *ds, int port,
-				 const struct switchdev_obj_port_vlan *vlan,
-				 struct switchdev_trans *trans);
+	int (*port_vlan_prepare)(struct dsa_switch *ds, int port,
+				 const struct switchdev_obj_port_vlan *vlan);
+	void (*port_vlan_add)(struct dsa_switch *ds, int port,
+			      const struct switchdev_obj_port_vlan *vlan);
 	int	(*port_vlan_del)(struct dsa_switch *ds, int port,
 				 const struct switchdev_obj_port_vlan *vlan);
 	/*
@@ -433,12 +439,10 @@ struct dsa_switch_ops {
 	/*
 	 * Multicast database
 	 */
-	int	(*port_mdb_prepare)(struct dsa_switch *ds, int port,
-				    const struct switchdev_obj_port_mdb *mdb,
-				    struct switchdev_trans *trans);
-	void	(*port_mdb_add)(struct dsa_switch *ds, int port,
-				const struct switchdev_obj_port_mdb *mdb,
-				struct switchdev_trans *trans);
+	int (*port_mdb_prepare)(struct dsa_switch *ds, int port,
+				const struct switchdev_obj_port_mdb *mdb);
+	void (*port_mdb_add)(struct dsa_switch *ds, int port,
+			     const struct switchdev_obj_port_mdb *mdb);
 	int	(*port_mdb_del)(struct dsa_switch *ds, int port,
 				const struct switchdev_obj_port_mdb *mdb);
 	/*
@@ -472,11 +476,20 @@ struct dsa_switch_driver {
 	const struct dsa_switch_ops *ops;
 };
 
+#if IS_ENABLED(CONFIG_NET_DSA_LEGACY)
 /* Legacy driver registration */
 void register_switch_driver(struct dsa_switch_driver *type);
 void unregister_switch_driver(struct dsa_switch_driver *type);
 struct mii_bus *dsa_host_dev_to_mii_bus(struct device *dev);
 
+#else
+static inline void register_switch_driver(struct dsa_switch_driver *type) { }
+static inline void unregister_switch_driver(struct dsa_switch_driver *type) { }
+static inline struct mii_bus *dsa_host_dev_to_mii_bus(struct device *dev)
+{
+	return NULL;
+}
+#endif
 struct net_device *dsa_dev_to_net_device(struct device *dev);
 
 /* Keep inline for faster access in hot path */
diff --git a/include/net/dst.h b/include/net/dst.h
index b091fd536098..33d2a5433924 100644
--- a/include/net/dst.h
+++ b/include/net/dst.h
@@ -34,13 +34,9 @@ struct sk_buff;
 
 struct dst_entry {
 	struct net_device       *dev;
-	struct rcu_head		rcu_head;
-	struct dst_entry	*child;
 	struct  dst_ops	        *ops;
 	unsigned long		_metrics;
 	unsigned long           expires;
-	struct dst_entry	*path;
-	struct dst_entry	*from;
 #ifdef CONFIG_XFRM
 	struct xfrm_state	*xfrm;
 #else
@@ -59,8 +55,6 @@ struct dst_entry {
 #define DST_XFRM_QUEUE		0x0040
 #define DST_METADATA		0x0080
 
-	short			error;
-
 	/* A non-zero value of dst->obsolete forces by-hand validation
 	 * of the route entry.  Positive values are set by the generic
 	 * dst layer to indicate that the entry has been forcefully
@@ -76,35 +70,24 @@ struct dst_entry {
 #define DST_OBSOLETE_KILL	-2
 	unsigned short		header_len;	/* more space at head required */
 	unsigned short		trailer_len;	/* space to reserve at tail */
-	unsigned short		__pad3;
 
-#ifdef CONFIG_IP_ROUTE_CLASSID
-	__u32			tclassid;
-#else
-	__u32			__pad2;
-#endif
-
-#ifdef CONFIG_64BIT
-	/*
-	 * Align __refcnt to a 64 bytes alignment
-	 * (L1_CACHE_SIZE would be too much)
-	 */
-	long			__pad_to_align_refcnt[2];
-#endif
 	/*
 	 * __refcnt wants to be on a different cache line from
 	 * input/output/ops or performance tanks badly
 	 */
-	atomic_t		__refcnt;	/* client references	*/
+#ifdef CONFIG_64BIT
+	atomic_t		__refcnt;	/* 64-bit offset 64 */
+#endif
 	int			__use;
 	unsigned long		lastuse;
 	struct lwtunnel_state   *lwtstate;
-	union {
-		struct dst_entry	*next;
-		struct rtable __rcu	*rt_next;
-		struct rt6_info __rcu	*rt6_next;
-		struct dn_route __rcu	*dn_next;
-	};
+	struct rcu_head		rcu_head;
+	short			error;
+	short			__pad;
+	__u32			tclassid;
+#ifndef CONFIG_64BIT
+	atomic_t		__refcnt;	/* 32-bit offset 64 */
+#endif
 };
 
 struct dst_metrics {
@@ -250,7 +233,7 @@ static inline void dst_hold(struct dst_entry *dst)
 {
 	/*
 	 * If your kernel compilation stops here, please check
-	 * __pad_to_align_refcnt declaration in struct dst_entry
+	 * the placement of __refcnt in struct dst_entry
 	 */
 	BUILD_BUG_ON(offsetof(struct dst_entry, __refcnt) & 63);
 	WARN_ON(atomic_inc_not_zero(&dst->__refcnt) == 0);
diff --git a/include/net/erspan.h b/include/net/erspan.h
index ca94fc86865e..6e758d08c9ee 100644
--- a/include/net/erspan.h
+++ b/include/net/erspan.h
@@ -58,4 +58,55 @@ struct erspanhdr {
 	struct erspan_metadata md;
 };
 
+static inline u8 tos_to_cos(u8 tos)
+{
+	u8 dscp, cos;
+
+	dscp = tos >> 2;
+	cos = dscp >> 3;
+	return cos;
+}
+
+static inline void erspan_build_header(struct sk_buff *skb,
+				__be32 id, u32 index,
+				bool truncate, bool is_ipv4)
+{
+	struct ethhdr *eth = eth_hdr(skb);
+	enum erspan_encap_type enc_type;
+	struct erspanhdr *ershdr;
+	struct qtag_prefix {
+		__be16 eth_type;
+		__be16 tci;
+	} *qp;
+	u16 vlan_tci = 0;
+	u8 tos;
+
+	tos = is_ipv4 ? ip_hdr(skb)->tos :
+			(ipv6_hdr(skb)->priority << 4) +
+			(ipv6_hdr(skb)->flow_lbl[0] >> 4);
+
+	enc_type = ERSPAN_ENCAP_NOVLAN;
+
+	/* If mirrored packet has vlan tag, extract tci and
+	 *  perserve vlan header in the mirrored frame.
+	 */
+	if (eth->h_proto == htons(ETH_P_8021Q)) {
+		qp = (struct qtag_prefix *)(skb->data + 2 * ETH_ALEN);
+		vlan_tci = ntohs(qp->tci);
+		enc_type = ERSPAN_ENCAP_INFRAME;
+	}
+
+	skb_push(skb, sizeof(*ershdr));
+	ershdr = (struct erspanhdr *)skb->data;
+	memset(ershdr, 0, sizeof(*ershdr));
+
+	ershdr->ver_vlan = htons((vlan_tci & VLAN_MASK) |
+				 (ERSPAN_VERSION << VER_OFFSET));
+	ershdr->session_id = htons((u16)(ntohl(id) & ID_MASK) |
+			   ((tos_to_cos(tos) << COS_OFFSET) & COS_MASK) |
+			   (enc_type << EN_OFFSET & EN_MASK) |
+			   ((truncate << T_OFFSET) & T_MASK));
+	ershdr->md.index = htonl(index & INDEX_MASK);
+}
+
 #endif
diff --git a/include/net/gen_stats.h b/include/net/gen_stats.h
index 304f7aa9cc01..0304ba2ae353 100644
--- a/include/net/gen_stats.h
+++ b/include/net/gen_stats.h
@@ -49,6 +49,9 @@ int gnet_stats_copy_rate_est(struct gnet_dump *d,
 int gnet_stats_copy_queue(struct gnet_dump *d,
 			  struct gnet_stats_queue __percpu *cpu_q,
 			  struct gnet_stats_queue *q, __u32 qlen);
+void __gnet_stats_copy_queue(struct gnet_stats_queue *qstats,
+			     const struct gnet_stats_queue __percpu *cpu_q,
+			     const struct gnet_stats_queue *q, __u32 qlen);
 int gnet_stats_copy_app(struct gnet_dump *d, void *st, int len);
 
 int gnet_stats_finish_copy(struct gnet_dump *d);
diff --git a/include/net/inet_connection_sock.h b/include/net/inet_connection_sock.h
index 0358745ea059..8e1bf9ae4a5e 100644
--- a/include/net/inet_connection_sock.h
+++ b/include/net/inet_connection_sock.h
@@ -77,6 +77,7 @@ struct inet_connection_sock_af_ops {
  * @icsk_af_ops		   Operations which are AF_INET{4,6} specific
  * @icsk_ulp_ops	   Pluggable ULP control hook
  * @icsk_ulp_data	   ULP private data
+ * @icsk_listen_portaddr_node	hash to the portaddr listener hashtable
  * @icsk_ca_state:	   Congestion control state
  * @icsk_retransmits:	   Number of unrecovered [RTO] timeouts
  * @icsk_pending:	   Scheduled timer event
@@ -101,6 +102,7 @@ struct inet_connection_sock {
 	const struct inet_connection_sock_af_ops *icsk_af_ops;
 	const struct tcp_ulp_ops  *icsk_ulp_ops;
 	void			  *icsk_ulp_data;
+	struct hlist_node         icsk_listen_portaddr_node;
 	unsigned int		  (*icsk_sync_mss)(struct sock *sk, u32 pmtu);
 	__u8			  icsk_ca_state:6,
 				  icsk_ca_setsockopt:1,
diff --git a/include/net/inet_hashtables.h b/include/net/inet_hashtables.h
index 2dbbbff5e1e3..9141e95529e7 100644
--- a/include/net/inet_hashtables.h
+++ b/include/net/inet_hashtables.h
@@ -111,6 +111,7 @@ struct inet_bind_hashbucket {
  */
 struct inet_listen_hashbucket {
 	spinlock_t		lock;
+	unsigned int		count;
 	struct hlist_head	head;
 };
 
@@ -132,12 +133,13 @@ struct inet_hashinfo {
 	/* Ok, let's try this, I give up, we do need a local binding
 	 * TCP hash as well as the others for fast bind/connect.
 	 */
+	struct kmem_cache		*bind_bucket_cachep;
 	struct inet_bind_hashbucket	*bhash;
-
 	unsigned int			bhash_size;
-	/* 4 bytes hole on 64 bit */
 
-	struct kmem_cache		*bind_bucket_cachep;
+	/* The 2nd listener table hashed by local port and address */
+	unsigned int			lhash2_mask;
+	struct inet_listen_hashbucket	*lhash2;
 
 	/* All the above members are written once at bootup and
 	 * never written again _or_ are predominantly read-access.
@@ -145,14 +147,25 @@ struct inet_hashinfo {
 	 * Now align to a new cache line as all the following members
 	 * might be often dirty.
 	 */
-	/* All sockets in TCP_LISTEN state will be in here.  This is the only
-	 * table where wildcard'd TCP sockets can exist.  Hash function here
-	 * is just local port number.
+	/* All sockets in TCP_LISTEN state will be in listening_hash.
+	 * This is the only table where wildcard'd TCP sockets can
+	 * exist.  listening_hash is only hashed by local port number.
+	 * If lhash2 is initialized, the same socket will also be hashed
+	 * to lhash2 by port and address.
 	 */
 	struct inet_listen_hashbucket	listening_hash[INET_LHTABLE_SIZE]
 					____cacheline_aligned_in_smp;
 };
 
+#define inet_lhash2_for_each_icsk_rcu(__icsk, list) \
+	hlist_for_each_entry_rcu(__icsk, list, icsk_listen_portaddr_node)
+
+static inline struct inet_listen_hashbucket *
+inet_lhash2_bucket(struct inet_hashinfo *h, u32 hash)
+{
+	return &h->lhash2[hash & h->lhash2_mask];
+}
+
 static inline struct inet_ehash_bucket *inet_ehash_bucket(
 	struct inet_hashinfo *hashinfo,
 	unsigned int hash)
@@ -208,6 +221,10 @@ int __inet_inherit_port(const struct sock *sk, struct sock *child);
 void inet_put_port(struct sock *sk);
 
 void inet_hashinfo_init(struct inet_hashinfo *h);
+void inet_hashinfo2_init(struct inet_hashinfo *h, const char *name,
+			 unsigned long numentries, int scale,
+			 unsigned long low_limit,
+			 unsigned long high_limit);
 
 bool inet_ehash_insert(struct sock *sk, struct sock *osk);
 bool inet_ehash_nolisten(struct sock *sk, struct sock *osk);
diff --git a/include/net/ip.h b/include/net/ip.h
index 9896f46cbbf1..fc9bf1b1fe2c 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -26,12 +26,14 @@
 #include <linux/ip.h>
 #include <linux/in.h>
 #include <linux/skbuff.h>
+#include <linux/jhash.h>
 
 #include <net/inet_sock.h>
 #include <net/route.h>
 #include <net/snmp.h>
 #include <net/flow.h>
 #include <net/flow_dissector.h>
+#include <net/netns/hash.h>
 
 #define IPV4_MAX_PMTU		65535U		/* RFC 2675, Section 5.1 */
 
@@ -521,6 +523,13 @@ static inline unsigned int ipv4_addr_hash(__be32 ip)
 	return (__force unsigned int) ip;
 }
 
+static inline u32 ipv4_portaddr_hash(const struct net *net,
+				     __be32 saddr,
+				     unsigned int port)
+{
+	return jhash_1word((__force u32)saddr, net_hash_mix(net)) ^ port;
+}
+
 bool ip_call_ra_chain(struct sk_buff *skb);
 
 /*
diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
index 10c913816032..44d96a91e745 100644
--- a/include/net/ip6_fib.h
+++ b/include/net/ip6_fib.h
@@ -129,6 +129,8 @@ struct rt6_exception {
 
 struct rt6_info {
 	struct dst_entry		dst;
+	struct rt6_info __rcu		*rt6_next;
+	struct rt6_info			*from;
 
 	/*
 	 * Tail elements of dst_entry (__refcnt etc.)
@@ -176,11 +178,11 @@ struct rt6_info {
 
 #define for_each_fib6_node_rt_rcu(fn)					\
 	for (rt = rcu_dereference((fn)->leaf); rt;			\
-	     rt = rcu_dereference(rt->dst.rt6_next))
+	     rt = rcu_dereference(rt->rt6_next))
 
 #define for_each_fib6_walker_rt(w)					\
 	for (rt = (w)->leaf; rt;					\
-	     rt = rcu_dereference_protected(rt->dst.rt6_next, 1))
+	     rt = rcu_dereference_protected(rt->rt6_next, 1))
 
 static inline struct inet6_dev *ip6_dst_idev(struct dst_entry *dst)
 {
@@ -203,11 +205,9 @@ static inline void rt6_update_expires(struct rt6_info *rt0, int timeout)
 {
 	struct rt6_info *rt;
 
-	for (rt = rt0; rt && !(rt->rt6i_flags & RTF_EXPIRES);
-	     rt = (struct rt6_info *)rt->dst.from);
+	for (rt = rt0; rt && !(rt->rt6i_flags & RTF_EXPIRES); rt = rt->from);
 	if (rt && rt != rt0)
 		rt0->dst.expires = rt->dst.expires;
-
 	dst_set_expires(&rt0->dst, timeout);
 	rt0->rt6i_flags |= RTF_EXPIRES;
 }
@@ -242,8 +242,8 @@ static inline u32 rt6_get_cookie(const struct rt6_info *rt)
 	u32 cookie = 0;
 
 	if (rt->rt6i_flags & RTF_PCPU ||
-	    (unlikely(!list_empty(&rt->rt6i_uncached)) && rt->dst.from))
-		rt = (struct rt6_info *)(rt->dst.from);
+	    (unlikely(!list_empty(&rt->rt6i_uncached)) && rt->from))
+		rt = rt->from;
 
 	rt6_get_cookie_safe(rt, &cookie);
 
diff --git a/include/net/ip6_tunnel.h b/include/net/ip6_tunnel.h
index d66f70f63734..109a5a8877ef 100644
--- a/include/net/ip6_tunnel.h
+++ b/include/net/ip6_tunnel.h
@@ -36,6 +36,7 @@ struct __ip6_tnl_parm {
 	__be32			o_key;
 
 	__u32			fwmark;
+	__u32			index;	/* ERSPAN type II index */
 };
 
 /* IPv6 tunnel */
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index f73797e2fa60..25be4715578c 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -22,6 +22,7 @@
 #include <net/flow.h>
 #include <net/flow_dissector.h>
 #include <net/snmp.h>
+#include <net/netns/hash.h>
 
 #define SIN6_LEN_RFC2133	24
 
@@ -673,6 +674,22 @@ static inline bool ipv6_addr_v4mapped(const struct in6_addr *a)
 					cpu_to_be32(0x0000ffff))) == 0UL;
 }
 
+static inline u32 ipv6_portaddr_hash(const struct net *net,
+				     const struct in6_addr *addr6,
+				     unsigned int port)
+{
+	unsigned int hash, mix = net_hash_mix(net);
+
+	if (ipv6_addr_any(addr6))
+		hash = jhash_1word(0, mix);
+	else if (ipv6_addr_v4mapped(addr6))
+		hash = jhash_1word((__force u32)addr6->s6_addr32[3], mix);
+	else
+		hash = jhash2((__force u32 *)addr6->s6_addr32, 4, mix);
+
+	return hash ^ port;
+}
+
 /*
  * Check for a RFC 4843 ORCHID address
  * (Overlay Routable Cryptographic Hash Identifiers)
diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h
index d1f413f06c72..240469228851 100644
--- a/include/net/pkt_sched.h
+++ b/include/net/pkt_sched.h
@@ -105,16 +105,18 @@ struct qdisc_rate_table *qdisc_get_rtab(struct tc_ratespec *r,
 void qdisc_put_rtab(struct qdisc_rate_table *tab);
 void qdisc_put_stab(struct qdisc_size_table *tab);
 void qdisc_warn_nonwc(const char *txt, struct Qdisc *qdisc);
-int sch_direct_xmit(struct sk_buff *skb, struct Qdisc *q,
-		    struct net_device *dev, struct netdev_queue *txq,
-		    spinlock_t *root_lock, bool validate);
+bool sch_direct_xmit(struct sk_buff *skb, struct Qdisc *q,
+		     struct net_device *dev, struct netdev_queue *txq,
+		     spinlock_t *root_lock, bool validate);
 
 void __qdisc_run(struct Qdisc *q);
 
 static inline void qdisc_run(struct Qdisc *q)
 {
-	if (qdisc_run_begin(q))
+	if (qdisc_run_begin(q)) {
 		__qdisc_run(q);
+		qdisc_run_end(q);
+	}
 }
 
 static inline __be16 tc_skb_protocol(const struct sk_buff *skb)
diff --git a/include/net/rtnetlink.h b/include/net/rtnetlink.h
index ead018744ff5..14b6b3af8918 100644
--- a/include/net/rtnetlink.h
+++ b/include/net/rtnetlink.h
@@ -13,10 +13,10 @@ enum rtnl_link_flags {
 	RTNL_FLAG_DOIT_UNLOCKED = 1,
 };
 
-int __rtnl_register(int protocol, int msgtype,
-		    rtnl_doit_func, rtnl_dumpit_func, unsigned int flags);
 void rtnl_register(int protocol, int msgtype,
 		   rtnl_doit_func, rtnl_dumpit_func, unsigned int flags);
+int rtnl_register_module(struct module *owner, int protocol, int msgtype,
+			 rtnl_doit_func, rtnl_dumpit_func, unsigned int flags);
 int rtnl_unregister(int protocol, int msgtype);
 void rtnl_unregister_all(int protocol);
 
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 65d0d25f2648..8f8c0afe529b 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -71,6 +71,7 @@ struct Qdisc {
 				      * qdisc_tree_decrease_qlen() should stop.
 				      */
 #define TCQ_F_INVISIBLE		0x80 /* invisible by default in dump */
+#define TCQ_F_NOLOCK		0x100 /* qdisc does not require locking */
 	u32			limit;
 	const struct Qdisc_ops	*ops;
 	struct qdisc_size_table	__rcu *stab;
@@ -87,14 +88,14 @@ struct Qdisc {
 	/*
 	 * For performance sake on SMP, we put highly modified fields at the end
 	 */
-	struct sk_buff		*gso_skb ____cacheline_aligned_in_smp;
+	struct sk_buff_head	gso_skb ____cacheline_aligned_in_smp;
 	struct qdisc_skb_head	q;
 	struct gnet_stats_basic_packed bstats;
 	seqcount_t		running;
 	struct gnet_stats_queue	qstats;
 	unsigned long		state;
 	struct Qdisc            *next_sched;
-	struct sk_buff		*skb_bad_txq;
+	struct sk_buff_head	skb_bad_txq;
 	int			padded;
 	refcount_t		refcnt;
 
@@ -161,7 +162,8 @@ struct Qdisc_class_ops {
 	void			(*walk)(struct Qdisc *, struct qdisc_walker * arg);
 
 	/* Filter manipulation */
-	struct tcf_block *	(*tcf_block)(struct Qdisc *, unsigned long);
+	struct tcf_block *	(*tcf_block)(struct Qdisc *sch,
+					     unsigned long arg);
 	unsigned long		(*bind_tcf)(struct Qdisc *, unsigned long,
 					u32 classid);
 	void			(*unbind_tcf)(struct Qdisc *, unsigned long);
@@ -178,6 +180,7 @@ struct Qdisc_ops {
 	const struct Qdisc_class_ops	*cl_ops;
 	char			id[IFNAMSIZ];
 	int			priv_size;
+	unsigned int		static_flags;
 
 	int 			(*enqueue)(struct sk_buff *skb,
 					   struct Qdisc *sch,
@@ -185,11 +188,12 @@ struct Qdisc_ops {
 	struct sk_buff *	(*dequeue)(struct Qdisc *);
 	struct sk_buff *	(*peek)(struct Qdisc *);
 
-	int			(*init)(struct Qdisc *, struct nlattr *arg);
+	int			(*init)(struct Qdisc *sch, struct nlattr *arg);
 	void			(*reset)(struct Qdisc *);
 	void			(*destroy)(struct Qdisc *);
-	int			(*change)(struct Qdisc *, struct nlattr *arg);
-	void			(*attach)(struct Qdisc *);
+	int			(*change)(struct Qdisc *sch,
+					  struct nlattr *arg);
+	void			(*attach)(struct Qdisc *sch);
 
 	int			(*dump)(struct Qdisc *, struct sk_buff *);
 	int			(*dump_stats)(struct Qdisc *, struct gnet_dump *);
@@ -278,7 +282,6 @@ struct tcf_block {
 	struct net *net;
 	struct Qdisc *q;
 	struct list_head cb_list;
-	struct work_struct work;
 };
 
 static inline void qdisc_cb_private_validate(const struct sk_buff *skb, int sz)
@@ -289,11 +292,31 @@ static inline void qdisc_cb_private_validate(const struct sk_buff *skb, int sz)
 	BUILD_BUG_ON(sizeof(qcb->data) < sz);
 }
 
+static inline int qdisc_qlen_cpu(const struct Qdisc *q)
+{
+	return this_cpu_ptr(q->cpu_qstats)->qlen;
+}
+
 static inline int qdisc_qlen(const struct Qdisc *q)
 {
 	return q->q.qlen;
 }
 
+static inline int qdisc_qlen_sum(const struct Qdisc *q)
+{
+	__u32 qlen = 0;
+	int i;
+
+	if (q->flags & TCQ_F_NOLOCK) {
+		for_each_possible_cpu(i)
+			qlen += per_cpu_ptr(q->cpu_qstats, i)->qlen;
+	} else {
+		qlen = q->q.qlen;
+	}
+
+	return qlen;
+}
+
 static inline struct qdisc_skb_cb *qdisc_skb_cb(const struct sk_buff *skb)
 {
 	return (struct qdisc_skb_cb *)skb->cb;
@@ -630,12 +653,39 @@ static inline void qdisc_qstats_backlog_dec(struct Qdisc *sch,
 	sch->qstats.backlog -= qdisc_pkt_len(skb);
 }
 
+static inline void qdisc_qstats_cpu_backlog_dec(struct Qdisc *sch,
+						const struct sk_buff *skb)
+{
+	this_cpu_sub(sch->cpu_qstats->backlog, qdisc_pkt_len(skb));
+}
+
 static inline void qdisc_qstats_backlog_inc(struct Qdisc *sch,
 					    const struct sk_buff *skb)
 {
 	sch->qstats.backlog += qdisc_pkt_len(skb);
 }
 
+static inline void qdisc_qstats_cpu_backlog_inc(struct Qdisc *sch,
+						const struct sk_buff *skb)
+{
+	this_cpu_add(sch->cpu_qstats->backlog, qdisc_pkt_len(skb));
+}
+
+static inline void qdisc_qstats_cpu_qlen_inc(struct Qdisc *sch)
+{
+	this_cpu_inc(sch->cpu_qstats->qlen);
+}
+
+static inline void qdisc_qstats_cpu_qlen_dec(struct Qdisc *sch)
+{
+	this_cpu_dec(sch->cpu_qstats->qlen);
+}
+
+static inline void qdisc_qstats_cpu_requeues_inc(struct Qdisc *sch)
+{
+	this_cpu_inc(sch->cpu_qstats->requeues);
+}
+
 static inline void __qdisc_qstats_drop(struct Qdisc *sch, int count)
 {
 	sch->qstats.drops += count;
@@ -766,26 +816,30 @@ static inline struct sk_buff *qdisc_peek_head(struct Qdisc *sch)
 /* generic pseudo peek method for non-work-conserving qdisc */
 static inline struct sk_buff *qdisc_peek_dequeued(struct Qdisc *sch)
 {
+	struct sk_buff *skb = skb_peek(&sch->gso_skb);
+
 	/* we can reuse ->gso_skb because peek isn't called for root qdiscs */
-	if (!sch->gso_skb) {
-		sch->gso_skb = sch->dequeue(sch);
-		if (sch->gso_skb) {
+	if (!skb) {
+		skb = sch->dequeue(sch);
+
+		if (skb) {
+			__skb_queue_head(&sch->gso_skb, skb);
 			/* it's still part of the queue */
-			qdisc_qstats_backlog_inc(sch, sch->gso_skb);
+			qdisc_qstats_backlog_inc(sch, skb);
 			sch->q.qlen++;
 		}
 	}
 
-	return sch->gso_skb;
+	return skb;
 }
 
 /* use instead of qdisc->dequeue() for all qdiscs queried with ->peek() */
 static inline struct sk_buff *qdisc_dequeue_peeked(struct Qdisc *sch)
 {
-	struct sk_buff *skb = sch->gso_skb;
+	struct sk_buff *skb = skb_peek(&sch->gso_skb);
 
 	if (skb) {
-		sch->gso_skb = NULL;
+		skb = __skb_dequeue(&sch->gso_skb);
 		qdisc_qstats_backlog_dec(sch, skb);
 		sch->q.qlen--;
 	} else {
@@ -843,6 +897,14 @@ static inline void rtnl_qdisc_drop(struct sk_buff *skb, struct Qdisc *sch)
 	qdisc_qstats_drop(sch);
 }
 
+static inline int qdisc_drop_cpu(struct sk_buff *skb, struct Qdisc *sch,
+				 struct sk_buff **to_free)
+{
+	__qdisc_drop(skb, to_free);
+	qdisc_qstats_cpu_drop(sch);
+
+	return NET_XMIT_DROP;
+}
 
 static inline int qdisc_drop(struct sk_buff *skb, struct Qdisc *sch,
 			     struct sk_buff **to_free)
diff --git a/include/net/tc_act/tc_mirred.h b/include/net/tc_act/tc_mirred.h
index 21d253c9a8c6..a2e9cbca5c9e 100644
--- a/include/net/tc_act/tc_mirred.h
+++ b/include/net/tc_act/tc_mirred.h
@@ -8,10 +8,8 @@
 struct tcf_mirred {
 	struct tc_action	common;
 	int			tcfm_eaction;
-	int			tcfm_ifindex;
 	bool			tcfm_mac_header_xmit;
 	struct net_device __rcu	*tcfm_dev;
-	struct net		*net;
 	struct list_head	tcfm_list;
 };
 #define to_mirred(a) ((struct tcf_mirred *)a)
@@ -34,9 +32,9 @@ static inline bool is_tcf_mirred_egress_mirror(const struct tc_action *a)
 	return false;
 }
 
-static inline int tcf_mirred_ifindex(const struct tc_action *a)
+static inline struct net_device *tcf_mirred_dev(const struct tc_action *a)
 {
-	return to_mirred(a)->tcfm_ifindex;
+	return rtnl_dereference(to_mirred(a)->tcfm_dev);
 }
 
 #endif /* __NET_TC_MIR_H */
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 6da880d2f022..3c3744e52cd1 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -2011,10 +2011,12 @@ static inline int tcp_call_bpf(struct sock *sk, int op)
 	struct bpf_sock_ops_kern sock_ops;
 	int ret;
 
-	if (sk_fullsock(sk))
+	memset(&sock_ops, 0, sizeof(sock_ops));
+	if (sk_fullsock(sk)) {
+		sock_ops.is_fullsock = 1;
 		sock_owned_by_me(sk);
+	}
 
-	memset(&sock_ops, 0, sizeof(sock_ops));
 	sock_ops.sk = sk;
 	sock_ops.op = op;
 
diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index dc28a98ce97c..1ec0c4760646 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -968,7 +968,7 @@ static inline bool xfrm_sec_ctx_match(struct xfrm_sec_ctx *s1, struct xfrm_sec_c
 
 /* A struct encoding bundle of transformations to apply to some set of flow.
  *
- * dst->child points to the next element of bundle.
+ * xdst->child points to the next element of bundle.
  * dst->xfrm  points to an instanse of transformer.
  *
  * Due to unfortunate limitations of current routing cache, which we
@@ -984,6 +984,8 @@ struct xfrm_dst {
 		struct rt6_info		rt6;
 	} u;
 	struct dst_entry *route;
+	struct dst_entry *child;
+	struct dst_entry *path;
 	struct xfrm_policy *pols[XFRM_POLICY_TYPE_MAX];
 	int num_pols, num_xfrms;
 	u32 xfrm_genid;
@@ -994,7 +996,35 @@ struct xfrm_dst {
 	u32 path_cookie;
 };
 
+static inline struct dst_entry *xfrm_dst_path(const struct dst_entry *dst)
+{
 #ifdef CONFIG_XFRM
+	if (dst->xfrm) {
+		const struct xfrm_dst *xdst = (const struct xfrm_dst *) dst;
+
+		return xdst->path;
+	}
+#endif
+	return (struct dst_entry *) dst;
+}
+
+static inline struct dst_entry *xfrm_dst_child(const struct dst_entry *dst)
+{
+#ifdef CONFIG_XFRM
+	if (dst->xfrm) {
+		struct xfrm_dst *xdst = (struct xfrm_dst *) dst;
+		return xdst->child;
+	}
+#endif
+	return NULL;
+}
+
+#ifdef CONFIG_XFRM
+static inline void xfrm_dst_set_child(struct xfrm_dst *xdst, struct dst_entry *child)
+{
+	xdst->child = child;
+}
+
 static inline void xfrm_dst_destroy(struct xfrm_dst *xdst)
 {
 	xfrm_pols_put(xdst->pols, xdst->num_pols);
@@ -1866,12 +1896,14 @@ bool xfrm_dev_offload_ok(struct sk_buff *skb, struct xfrm_state *x);
 static inline bool xfrm_dst_offload_ok(struct dst_entry *dst)
 {
 	struct xfrm_state *x = dst->xfrm;
+	struct xfrm_dst *xdst;
 
 	if (!x || !x->type_offload)
 		return false;
 
-	if (x->xso.offload_handle && (x->xso.dev == dst->path->dev) &&
-	    !dst->child->xfrm)
+	xdst = (struct xfrm_dst *) dst;
+	if (x->xso.offload_handle && (x->xso.dev == xfrm_dst_path(dst)->dev) &&
+	    !xdst->child->xfrm)
 		return true;
 
 	return false;
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 4c223ab30293..80d62e88590c 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -941,6 +941,12 @@ struct bpf_sock_ops {
 	__u32 local_ip6[4];	/* Stored in network byte order */
 	__u32 remote_port;	/* Stored in network byte order */
 	__u32 local_port;	/* stored in host byte order */
+	__u32 is_fullsock;	/* Some TCP fields are only valid if
+				 * there is a full socket. If not, the
+				 * fields read as zero.
+				 */
+	__u32 snd_cwnd;
+	__u32 srtt_us;		/* Averaged RTT << 3 in usecs */
 };
 
 /* List of known BPF sock_ops operators.
diff --git a/include/uapi/linux/ethtool.h b/include/uapi/linux/ethtool.h
index ac71559314e7..44a0b675a6bc 100644
--- a/include/uapi/linux/ethtool.h
+++ b/include/uapi/linux/ethtool.h
@@ -1686,6 +1686,7 @@ enum ethtool_reset_flags {
 	ETH_RESET_PHY		= 1 << 6,	/* Transceiver/PHY */
 	ETH_RESET_RAM		= 1 << 7,	/* RAM shared between
 						 * multiple components */
+	ETH_RESET_AP		= 1 << 8,	/* Application processor */
 
 	ETH_RESET_DEDICATED	= 0x0000ffff,	/* All components dedicated to
 						 * this interface */
diff --git a/include/uapi/linux/if_tun.h b/include/uapi/linux/if_tun.h
index 030d3e6d6029..fb38c1797131 100644
--- a/include/uapi/linux/if_tun.h
+++ b/include/uapi/linux/if_tun.h
@@ -57,6 +57,7 @@
  */
 #define TUNSETVNETBE _IOW('T', 222, int)
 #define TUNGETVNETBE _IOR('T', 223, int)
+#define TUNSETSTEERINGEBPF _IOR('T', 224, int)
 
 /* TUNSETIFF ifr flags */
 #define IFF_TUN		0x0001
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index d4593571c404..7afa92e9b409 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -216,6 +216,17 @@ static const char * const reg_type_str[] = {
 	[PTR_TO_PACKET_END]	= "pkt_end",
 };
 
+static void print_liveness(struct bpf_verifier_env *env,
+			   enum bpf_reg_liveness live)
+{
+	if (live & (REG_LIVE_READ | REG_LIVE_WRITTEN))
+	    verbose(env, "_");
+	if (live & REG_LIVE_READ)
+		verbose(env, "r");
+	if (live & REG_LIVE_WRITTEN)
+		verbose(env, "w");
+}
+
 static void print_verifier_state(struct bpf_verifier_env *env,
 				 struct bpf_verifier_state *state)
 {
@@ -228,7 +239,9 @@ static void print_verifier_state(struct bpf_verifier_env *env,
 		t = reg->type;
 		if (t == NOT_INIT)
 			continue;
-		verbose(env, " R%d=%s", i, reg_type_str[t]);
+		verbose(env, " R%d", i);
+		print_liveness(env, reg->live);
+		verbose(env, "=%s", reg_type_str[t]);
 		if ((t == SCALAR_VALUE || t == PTR_TO_STACK) &&
 		    tnum_is_const(reg->var_off)) {
 			/* reg->off should be 0 for SCALAR_VALUE */
@@ -277,10 +290,13 @@ static void print_verifier_state(struct bpf_verifier_env *env,
 		}
 	}
 	for (i = 0; i < state->allocated_stack / BPF_REG_SIZE; i++) {
-		if (state->stack[i].slot_type[0] == STACK_SPILL)
-			verbose(env, " fp%d=%s",
-				-MAX_BPF_STACK + i * BPF_REG_SIZE,
+		if (state->stack[i].slot_type[0] == STACK_SPILL) {
+			verbose(env, " fp%d",
+				(-i - 1) * BPF_REG_SIZE);
+			print_liveness(env, state->stack[i].spilled_ptr.live);
+			verbose(env, "=%s",
 				reg_type_str[state->stack[i].spilled_ptr.type]);
+		}
 	}
 	verbose(env, "\n");
 }
@@ -568,8 +584,8 @@ static void mark_reg_unknown(struct bpf_verifier_env *env,
 {
 	if (WARN_ON(regno >= MAX_BPF_REG)) {
 		verbose(env, "mark_reg_unknown(regs, %u)\n", regno);
-		/* Something bad happened, let's kill all regs */
-		for (regno = 0; regno < MAX_BPF_REG; regno++)
+		/* Something bad happened, let's kill all regs except FP */
+		for (regno = 0; regno < BPF_REG_FP; regno++)
 			__mark_reg_not_init(regs + regno);
 		return;
 	}
@@ -587,8 +603,8 @@ static void mark_reg_not_init(struct bpf_verifier_env *env,
 {
 	if (WARN_ON(regno >= MAX_BPF_REG)) {
 		verbose(env, "mark_reg_not_init(regs, %u)\n", regno);
-		/* Something bad happened, let's kill all regs */
-		for (regno = 0; regno < MAX_BPF_REG; regno++)
+		/* Something bad happened, let's kill all regs except FP */
+		for (regno = 0; regno < BPF_REG_FP; regno++)
 			__mark_reg_not_init(regs + regno);
 		return;
 	}
@@ -779,6 +795,11 @@ static int check_stack_read(struct bpf_verifier_env *env,
 		if (value_regno >= 0) {
 			/* restore register state from stack */
 			state->regs[value_regno] = state->stack[spi].spilled_ptr;
+			/* mark reg as written since spilled pointer state likely
+			 * has its liveness marks cleared by is_state_visited()
+			 * which resets stack/reg liveness for state transitions
+			 */
+			state->regs[value_regno].live |= REG_LIVE_WRITTEN;
 			mark_stack_slot_read(state, spi);
 		}
 		return 0;
@@ -1244,9 +1265,9 @@ static int check_xadd(struct bpf_verifier_env *env, int insn_idx, struct bpf_ins
 }
 
 /* Does this register contain a constant zero? */
-static bool register_is_null(struct bpf_reg_state reg)
+static bool register_is_null(struct bpf_reg_state *reg)
 {
-	return reg.type == SCALAR_VALUE && tnum_equals_const(reg.var_off, 0);
+	return reg->type == SCALAR_VALUE && tnum_equals_const(reg->var_off, 0);
 }
 
 /* when register 'regno' is passed into function that will read 'access_size'
@@ -1259,31 +1280,31 @@ static int check_stack_boundary(struct bpf_verifier_env *env, int regno,
 				int access_size, bool zero_size_allowed,
 				struct bpf_call_arg_meta *meta)
 {
+	struct bpf_reg_state *reg = cur_regs(env) + regno;
 	struct bpf_verifier_state *state = env->cur_state;
-	struct bpf_reg_state *regs = state->regs;
 	int off, i, slot, spi;
 
-	if (regs[regno].type != PTR_TO_STACK) {
+	if (reg->type != PTR_TO_STACK) {
 		/* Allow zero-byte read from NULL, regardless of pointer type */
 		if (zero_size_allowed && access_size == 0 &&
-		    register_is_null(regs[regno]))
+		    register_is_null(reg))
 			return 0;
 
 		verbose(env, "R%d type=%s expected=%s\n", regno,
-			reg_type_str[regs[regno].type],
+			reg_type_str[reg->type],
 			reg_type_str[PTR_TO_STACK]);
 		return -EACCES;
 	}
 
 	/* Only allow fixed-offset stack reads */
-	if (!tnum_is_const(regs[regno].var_off)) {
+	if (!tnum_is_const(reg->var_off)) {
 		char tn_buf[48];
 
-		tnum_strn(tn_buf, sizeof(tn_buf), regs[regno].var_off);
+		tnum_strn(tn_buf, sizeof(tn_buf), reg->var_off);
 		verbose(env, "invalid variable stack read R%d var_off=%s\n",
 			regno, tn_buf);
 	}
-	off = regs[regno].off + regs[regno].var_off.value;
+	off = reg->off + reg->var_off.value;
 	if (off >= 0 || off < -MAX_BPF_STACK || off + access_size > 0 ||
 	    access_size < 0 || (access_size == 0 && !zero_size_allowed)) {
 		verbose(env, "invalid stack type R%d off=%d access_size=%d\n",
@@ -1391,7 +1412,7 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 regno,
 		 * passed in as argument, it's a SCALAR_VALUE type. Final test
 		 * happens during stack boundary checking.
 		 */
-		if (register_is_null(*reg) &&
+		if (register_is_null(reg) &&
 		    arg_type == ARG_PTR_TO_MEM_OR_NULL)
 			/* final test in check_stack_boundary() */;
 		else if (!type_is_pkt_pointer(type) &&
@@ -2934,8 +2955,9 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
 	if (BPF_SRC(insn->code) == BPF_K &&
 	    (opcode == BPF_JEQ || opcode == BPF_JNE) &&
 	    dst_reg->type == SCALAR_VALUE &&
-	    tnum_equals_const(dst_reg->var_off, insn->imm)) {
-		if (opcode == BPF_JEQ) {
+	    tnum_is_const(dst_reg->var_off)) {
+		if ((opcode == BPF_JEQ && dst_reg->var_off.value == insn->imm) ||
+		    (opcode == BPF_JNE && dst_reg->var_off.value != insn->imm)) {
 			/* if (imm == imm) goto pc+off;
 			 * only follow the goto, ignore fall-through
 			 */
diff --git a/net/atm/common.c b/net/atm/common.c
index 8a4f99114cd2..5763fd241dc3 100644
--- a/net/atm/common.c
+++ b/net/atm/common.c
@@ -14,7 +14,7 @@
 #include <linux/capability.h>
 #include <linux/mm.h>
 #include <linux/sched/signal.h>
-#include <linux/time.h>		/* struct timeval */
+#include <linux/time64.h>	/* 64-bit time for seconds */
 #include <linux/skbuff.h>
 #include <linux/bitops.h>
 #include <linux/init.h>
diff --git a/net/atm/mpc.c b/net/atm/mpc.c
index 7c6a1cc760a2..31e0dcb970f8 100644
--- a/net/atm/mpc.c
+++ b/net/atm/mpc.c
@@ -1089,7 +1089,7 @@ static void MPOA_trigger_rcvd(struct k_message *msg, struct mpoa_client *mpc)
 		msg->type = SND_MPOA_RES_RQST;
 		msg->content.in_info = entry->ctrl_info;
 		msg_to_mpoad(msg, mpc);
-		do_gettimeofday(&(entry->reply_wait));
+		entry->reply_wait = ktime_get_seconds();
 		mpc->in_ops->put(entry);
 		return;
 	}
@@ -1099,7 +1099,7 @@ static void MPOA_trigger_rcvd(struct k_message *msg, struct mpoa_client *mpc)
 		msg->type = SND_MPOA_RES_RQST;
 		msg->content.in_info = entry->ctrl_info;
 		msg_to_mpoad(msg, mpc);
-		do_gettimeofday(&(entry->reply_wait));
+		entry->reply_wait = ktime_get_seconds();
 		mpc->in_ops->put(entry);
 		return;
 	}
@@ -1175,8 +1175,9 @@ static void MPOA_res_reply_rcvd(struct k_message *msg, struct mpoa_client *mpc)
 	}
 
 	entry->ctrl_info = msg->content.in_info;
-	do_gettimeofday(&(entry->tv));
-	do_gettimeofday(&(entry->reply_wait)); /* Used in refreshing func from now on */
+	entry->time = ktime_get_seconds();
+	/* Used in refreshing func from now on */
+	entry->reply_wait = ktime_get_seconds();
 	entry->refresh_time = 0;
 	ddprintk_cont("entry->shortcut = %p\n", entry->shortcut);
 
diff --git a/net/atm/mpoa_caches.c b/net/atm/mpoa_caches.c
index e01450bb32d6..4bb418313720 100644
--- a/net/atm/mpoa_caches.c
+++ b/net/atm/mpoa_caches.c
@@ -117,7 +117,7 @@ static in_cache_entry *in_cache_add_entry(__be32 dst_ip,
 
 	memcpy(entry->MPS_ctrl_ATM_addr, client->mps_ctrl_addr, ATM_ESA_LEN);
 	entry->ctrl_info.in_dst_ip = dst_ip;
-	do_gettimeofday(&(entry->tv));
+	entry->time = ktime_get_seconds();
 	entry->retry_time = client->parameters.mpc_p4;
 	entry->count = 1;
 	entry->entry_state = INGRESS_INVALID;
@@ -148,7 +148,7 @@ static int cache_hit(in_cache_entry *entry, struct mpoa_client *mpc)
 			if (qos != NULL)
 				msg.qos = qos->qos;
 			msg_to_mpoad(&msg, mpc);
-			do_gettimeofday(&(entry->reply_wait));
+			entry->reply_wait = ktime_get_seconds();
 			entry->entry_state = INGRESS_RESOLVING;
 		}
 		if (entry->shortcut != NULL)
@@ -171,7 +171,7 @@ static int cache_hit(in_cache_entry *entry, struct mpoa_client *mpc)
 		if (qos != NULL)
 			msg.qos = qos->qos;
 		msg_to_mpoad(&msg, mpc);
-		do_gettimeofday(&(entry->reply_wait));
+		entry->reply_wait = ktime_get_seconds();
 	}
 
 	return CLOSED;
@@ -227,17 +227,16 @@ static void in_cache_remove_entry(in_cache_entry *entry,
 static void clear_count_and_expired(struct mpoa_client *client)
 {
 	in_cache_entry *entry, *next_entry;
-	struct timeval now;
+	time64_t now;
 
-	do_gettimeofday(&now);
+	now = ktime_get_seconds();
 
 	write_lock_bh(&client->ingress_lock);
 	entry = client->in_cache;
 	while (entry != NULL) {
 		entry->count = 0;
 		next_entry = entry->next;
-		if ((now.tv_sec - entry->tv.tv_sec)
-		   > entry->ctrl_info.holding_time) {
+		if ((now - entry->time) > entry->ctrl_info.holding_time) {
 			dprintk("holding time expired, ip = %pI4\n",
 				&entry->ctrl_info.in_dst_ip);
 			client->in_ops->remove_entry(entry, client);
@@ -253,35 +252,35 @@ static void check_resolving_entries(struct mpoa_client *client)
 
 	struct atm_mpoa_qos *qos;
 	in_cache_entry *entry;
-	struct timeval now;
+	time64_t now;
 	struct k_message msg;
 
-	do_gettimeofday(&now);
+	now = ktime_get_seconds();
 
 	read_lock_bh(&client->ingress_lock);
 	entry = client->in_cache;
 	while (entry != NULL) {
 		if (entry->entry_state == INGRESS_RESOLVING) {
-			if ((now.tv_sec - entry->hold_down.tv_sec) <
-			    client->parameters.mpc_p6) {
+
+			if ((now - entry->hold_down)
+					< client->parameters.mpc_p6) {
 				entry = entry->next;	/* Entry in hold down */
 				continue;
 			}
-			if ((now.tv_sec - entry->reply_wait.tv_sec) >
-			    entry->retry_time) {
+			if ((now - entry->reply_wait) > entry->retry_time) {
 				entry->retry_time = MPC_C1 * (entry->retry_time);
 				/*
 				 * Retry time maximum exceeded,
 				 * put entry in hold down.
 				 */
 				if (entry->retry_time > client->parameters.mpc_p5) {
-					do_gettimeofday(&(entry->hold_down));
+					entry->hold_down = ktime_get_seconds();
 					entry->retry_time = client->parameters.mpc_p4;
 					entry = entry->next;
 					continue;
 				}
 				/* Ask daemon to send a resolution request. */
-				memset(&(entry->hold_down), 0, sizeof(struct timeval));
+				memset(&entry->hold_down, 0, sizeof(time64_t));
 				msg.type = SND_MPOA_RES_RTRY;
 				memcpy(msg.MPS_ctrl, client->mps_ctrl_addr, ATM_ESA_LEN);
 				msg.content.in_info = entry->ctrl_info;
@@ -289,7 +288,7 @@ static void check_resolving_entries(struct mpoa_client *client)
 				if (qos != NULL)
 					msg.qos = qos->qos;
 				msg_to_mpoad(&msg, client);
-				do_gettimeofday(&(entry->reply_wait));
+				entry->reply_wait = ktime_get_seconds();
 			}
 		}
 		entry = entry->next;
@@ -300,18 +299,18 @@ static void check_resolving_entries(struct mpoa_client *client)
 /* Call this every MPC-p5 seconds. */
 static void refresh_entries(struct mpoa_client *client)
 {
-	struct timeval now;
+	time64_t now;
 	struct in_cache_entry *entry = client->in_cache;
 
 	ddprintk("refresh_entries\n");
-	do_gettimeofday(&now);
+	now = ktime_get_seconds();
 
 	read_lock_bh(&client->ingress_lock);
 	while (entry != NULL) {
 		if (entry->entry_state == INGRESS_RESOLVED) {
 			if (!(entry->refresh_time))
 				entry->refresh_time = (2 * (entry->ctrl_info.holding_time))/3;
-			if ((now.tv_sec - entry->reply_wait.tv_sec) >
+			if ((now - entry->reply_wait) >
 			    entry->refresh_time) {
 				dprintk("refreshing an entry.\n");
 				entry->entry_state = INGRESS_REFRESHING;
@@ -480,7 +479,7 @@ static eg_cache_entry *eg_cache_add_entry(struct k_message *msg,
 
 	memcpy(entry->MPS_ctrl_ATM_addr, client->mps_ctrl_addr, ATM_ESA_LEN);
 	entry->ctrl_info = msg->content.eg_info;
-	do_gettimeofday(&(entry->tv));
+	entry->time = ktime_get_seconds();
 	entry->entry_state = EGRESS_RESOLVED;
 	dprintk("new_eg_cache_entry cache_id %u\n",
 		ntohl(entry->ctrl_info.cache_id));
@@ -495,7 +494,7 @@ static eg_cache_entry *eg_cache_add_entry(struct k_message *msg,
 
 static void update_eg_cache_entry(eg_cache_entry *entry, uint16_t holding_time)
 {
-	do_gettimeofday(&(entry->tv));
+	entry->time = ktime_get_seconds();
 	entry->entry_state = EGRESS_RESOLVED;
 	entry->ctrl_info.holding_time = holding_time;
 }
@@ -503,17 +502,16 @@ static void update_eg_cache_entry(eg_cache_entry *entry, uint16_t holding_time)
 static void clear_expired(struct mpoa_client *client)
 {
 	eg_cache_entry *entry, *next_entry;
-	struct timeval now;
+	time64_t now;
 	struct k_message msg;
 
-	do_gettimeofday(&now);
+	now = ktime_get_seconds();
 
 	write_lock_irq(&client->egress_lock);
 	entry = client->eg_cache;
 	while (entry != NULL) {
 		next_entry = entry->next;
-		if ((now.tv_sec - entry->tv.tv_sec)
-		   > entry->ctrl_info.holding_time) {
+		if ((now - entry->time) > entry->ctrl_info.holding_time) {
 			msg.type = SND_EGRESS_PURGE;
 			msg.content.eg_info = entry->ctrl_info;
 			dprintk("egress_cache: holding time expired, cache_id = %u.\n",
diff --git a/net/atm/mpoa_caches.h b/net/atm/mpoa_caches.h
index 6a266669ebf4..464c4c7f8d1f 100644
--- a/net/atm/mpoa_caches.h
+++ b/net/atm/mpoa_caches.h
@@ -2,6 +2,7 @@
 #ifndef MPOA_CACHES_H
 #define MPOA_CACHES_H
 
+#include <linux/time64.h>
 #include <linux/netdevice.h>
 #include <linux/types.h>
 #include <linux/atm.h>
@@ -16,9 +17,9 @@ void atm_mpoa_init_cache(struct mpoa_client *mpc);
 typedef struct in_cache_entry {
 	struct in_cache_entry *next;
 	struct in_cache_entry *prev;
-	struct timeval  tv;
-	struct timeval  reply_wait;
-	struct timeval  hold_down;
+	time64_t  time;
+	time64_t  reply_wait;
+	time64_t  hold_down;
 	uint32_t  packets_fwded;
 	uint16_t  entry_state;
 	uint32_t retry_time;
@@ -53,7 +54,7 @@ struct in_cache_ops{
 typedef struct eg_cache_entry{
 	struct               eg_cache_entry *next;
 	struct               eg_cache_entry *prev;
-	struct               timeval  tv;
+	time64_t	     time;
 	uint8_t              MPS_ctrl_ATM_addr[ATM_ESA_LEN];
 	struct atm_vcc       *shortcut;
 	uint32_t             packets_rcvd;
diff --git a/net/atm/mpoa_proc.c b/net/atm/mpoa_proc.c
index 8a0c17e1c203..2212da9c2da2 100644
--- a/net/atm/mpoa_proc.c
+++ b/net/atm/mpoa_proc.c
@@ -8,7 +8,7 @@
 #include <linux/mm.h>
 #include <linux/module.h>
 #include <linux/proc_fs.h>
-#include <linux/time.h>
+#include <linux/ktime.h>
 #include <linux/seq_file.h>
 #include <linux/uaccess.h>
 #include <linux/atmmpc.h>
@@ -138,7 +138,7 @@ static int mpc_show(struct seq_file *m, void *v)
 	int i;
 	in_cache_entry *in_entry;
 	eg_cache_entry *eg_entry;
-	struct timeval now;
+	time64_t now;
 	unsigned char ip_string[16];
 
 	if (v == SEQ_START_TOKEN) {
@@ -148,15 +148,17 @@ static int mpc_show(struct seq_file *m, void *v)
 
 	seq_printf(m, "\nInterface %d:\n\n", mpc->dev_num);
 	seq_printf(m, "Ingress Entries:\nIP address      State      Holding time  Packets fwded  VPI  VCI\n");
-	do_gettimeofday(&now);
+	now = ktime_get_seconds();
 
 	for (in_entry = mpc->in_cache; in_entry; in_entry = in_entry->next) {
+		unsigned long seconds_delta = now - in_entry->time;
+
 		sprintf(ip_string, "%pI4", &in_entry->ctrl_info.in_dst_ip);
 		seq_printf(m, "%-16s%s%-14lu%-12u",
 			   ip_string,
 			   ingress_state_string(in_entry->entry_state),
 			   in_entry->ctrl_info.holding_time -
-			   (now.tv_sec-in_entry->tv.tv_sec),
+			   seconds_delta,
 			   in_entry->packets_fwded);
 		if (in_entry->shortcut)
 			seq_printf(m, "   %-3d  %-3d",
@@ -169,13 +171,14 @@ static int mpc_show(struct seq_file *m, void *v)
 	seq_printf(m, "Egress Entries:\nIngress MPC ATM addr\nCache-id        State      Holding time  Packets recvd  Latest IP addr   VPI VCI\n");
 	for (eg_entry = mpc->eg_cache; eg_entry; eg_entry = eg_entry->next) {
 		unsigned char *p = eg_entry->ctrl_info.in_MPC_data_ATM_addr;
+		unsigned long seconds_delta = now - eg_entry->time;
+
 		for (i = 0; i < ATM_ESA_LEN; i++)
 			seq_printf(m, "%02x", p[i]);
 		seq_printf(m, "\n%-16lu%s%-14lu%-15u",
 			   (unsigned long)ntohl(eg_entry->ctrl_info.cache_id),
 			   egress_state_string(eg_entry->entry_state),
-			   (eg_entry->ctrl_info.holding_time -
-			    (now.tv_sec-eg_entry->tv.tv_sec)),
+			   (eg_entry->ctrl_info.holding_time - seconds_delta),
 			   eg_entry->packets_rcvd);
 
 		/* latest IP address */
diff --git a/net/bridge/br_mdb.c b/net/bridge/br_mdb.c
index b0f4c734900b..6d9f48bd374a 100644
--- a/net/bridge/br_mdb.c
+++ b/net/bridge/br_mdb.c
@@ -760,9 +760,9 @@ static int br_mdb_del(struct sk_buff *skb, struct nlmsghdr *nlh,
 
 void br_mdb_init(void)
 {
-	rtnl_register(PF_BRIDGE, RTM_GETMDB, NULL, br_mdb_dump, 0);
-	rtnl_register(PF_BRIDGE, RTM_NEWMDB, br_mdb_add, NULL, 0);
-	rtnl_register(PF_BRIDGE, RTM_DELMDB, br_mdb_del, NULL, 0);
+	rtnl_register_module(THIS_MODULE, PF_BRIDGE, RTM_GETMDB, NULL, br_mdb_dump, 0);
+	rtnl_register_module(THIS_MODULE, PF_BRIDGE, RTM_NEWMDB, br_mdb_add, NULL, 0);
+	rtnl_register_module(THIS_MODULE, PF_BRIDGE, RTM_DELMDB, br_mdb_del, NULL, 0);
 }
 
 void br_mdb_uninit(void)
diff --git a/net/bridge/br_nf_core.c b/net/bridge/br_nf_core.c
index 20cbb727df4d..8e2d7cfa4e16 100644
--- a/net/bridge/br_nf_core.c
+++ b/net/bridge/br_nf_core.c
@@ -78,7 +78,6 @@ void br_netfilter_rtable_init(struct net_bridge *br)
 
 	atomic_set(&rt->dst.__refcnt, 1);
 	rt->dst.dev = br->dev;
-	rt->dst.path = &rt->dst;
 	dst_init_metrics(&rt->dst, br_dst_default_metrics, true);
 	rt->dst.flags	= DST_NOXFRM | DST_FAKE_RTABLE;
 	rt->dst.ops = &fake_dst_ops;
diff --git a/net/can/gw.c b/net/can/gw.c
index 73a02af4b5d7..398dd0395ad9 100644
--- a/net/can/gw.c
+++ b/net/can/gw.c
@@ -1014,6 +1014,8 @@ static struct pernet_operations cangw_pernet_ops = {
 
 static __init int cgw_module_init(void)
 {
+	int ret;
+
 	/* sanitize given module parameter */
 	max_hops = clamp_t(unsigned int, max_hops, CGW_MIN_HOPS, CGW_MAX_HOPS);
 
@@ -1031,15 +1033,19 @@ static __init int cgw_module_init(void)
 	notifier.notifier_call = cgw_notifier;
 	register_netdevice_notifier(&notifier);
 
-	if (__rtnl_register(PF_CAN, RTM_GETROUTE, NULL, cgw_dump_jobs, 0)) {
+	ret = rtnl_register_module(THIS_MODULE, PF_CAN, RTM_GETROUTE,
+				   NULL, cgw_dump_jobs, 0);
+	if (ret) {
 		unregister_netdevice_notifier(&notifier);
 		kmem_cache_destroy(cgw_cache);
 		return -ENOBUFS;
 	}
 
-	/* Only the first call to __rtnl_register can fail */
-	__rtnl_register(PF_CAN, RTM_NEWROUTE, cgw_create_job, NULL, 0);
-	__rtnl_register(PF_CAN, RTM_DELROUTE, cgw_remove_job, NULL, 0);
+	/* Only the first call to rtnl_register_module can fail */
+	rtnl_register_module(THIS_MODULE, PF_CAN, RTM_NEWROUTE,
+			     cgw_create_job, NULL, 0);
+	rtnl_register_module(THIS_MODULE, PF_CAN, RTM_DELROUTE,
+			     cgw_remove_job, NULL, 0);
 
 	return 0;
 }
diff --git a/net/core/dev.c b/net/core/dev.c
index f47e96b62308..8aa2f70995e8 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3162,6 +3162,21 @@ static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q,
 	int rc;
 
 	qdisc_calculate_pkt_len(skb, q);
+
+	if (q->flags & TCQ_F_NOLOCK) {
+		if (unlikely(test_bit(__QDISC_STATE_DEACTIVATED, &q->state))) {
+			__qdisc_drop(skb, &to_free);
+			rc = NET_XMIT_DROP;
+		} else {
+			rc = q->enqueue(skb, q, &to_free) & NET_XMIT_MASK;
+			__qdisc_run(q);
+		}
+
+		if (unlikely(to_free))
+			kfree_skb_list(to_free);
+		return rc;
+	}
+
 	/*
 	 * Heuristic to force contended enqueues to serialize on a
 	 * separate lock before trying to get qdisc main lock.
@@ -3192,9 +3207,9 @@ static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q,
 				contended = false;
 			}
 			__qdisc_run(q);
-		} else
-			qdisc_run_end(q);
+		}
 
+		qdisc_run_end(q);
 		rc = NET_XMIT_SUCCESS;
 	} else {
 		rc = q->enqueue(skb, q, &to_free) & NET_XMIT_MASK;
@@ -3204,6 +3219,7 @@ static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q,
 				contended = false;
 			}
 			__qdisc_run(q);
+			qdisc_run_end(q);
 		}
 	}
 	spin_unlock(root_lock);
@@ -4143,19 +4159,22 @@ static __latent_entropy void net_tx_action(struct softirq_action *h)
 
 		while (head) {
 			struct Qdisc *q = head;
-			spinlock_t *root_lock;
+			spinlock_t *root_lock = NULL;
 
 			head = head->next_sched;
 
-			root_lock = qdisc_lock(q);
-			spin_lock(root_lock);
+			if (!(q->flags & TCQ_F_NOLOCK)) {
+				root_lock = qdisc_lock(q);
+				spin_lock(root_lock);
+			}
 			/* We need to make sure head->next_sched is read
 			 * before clearing __QDISC_STATE_SCHED
 			 */
 			smp_mb__before_atomic();
 			clear_bit(__QDISC_STATE_SCHED, &q->state);
 			qdisc_run(q);
-			spin_unlock(root_lock);
+			if (root_lock)
+				spin_unlock(root_lock);
 		}
 	}
 }
@@ -7073,17 +7092,21 @@ int dev_change_proto_down(struct net_device *dev, bool proto_down)
 }
 EXPORT_SYMBOL(dev_change_proto_down);
 
-u8 __dev_xdp_attached(struct net_device *dev, bpf_op_t bpf_op, u32 *prog_id)
+void __dev_xdp_query(struct net_device *dev, bpf_op_t bpf_op,
+		     struct netdev_bpf *xdp)
 {
-	struct netdev_bpf xdp;
-
-	memset(&xdp, 0, sizeof(xdp));
-	xdp.command = XDP_QUERY_PROG;
+	memset(xdp, 0, sizeof(*xdp));
+	xdp->command = XDP_QUERY_PROG;
 
 	/* Query must always succeed. */
-	WARN_ON(bpf_op(dev, &xdp) < 0);
-	if (prog_id)
-		*prog_id = xdp.prog_id;
+	WARN_ON(bpf_op(dev, xdp) < 0);
+}
+
+static u8 __dev_xdp_attached(struct net_device *dev, bpf_op_t bpf_op)
+{
+	struct netdev_bpf xdp;
+
+	__dev_xdp_query(dev, bpf_op, &xdp);
 
 	return xdp.prog_attached;
 }
@@ -7106,6 +7129,27 @@ static int dev_xdp_install(struct net_device *dev, bpf_op_t bpf_op,
 	return bpf_op(dev, &xdp);
 }
 
+static void dev_xdp_uninstall(struct net_device *dev)
+{
+	struct netdev_bpf xdp;
+	bpf_op_t ndo_bpf;
+
+	/* Remove generic XDP */
+	WARN_ON(dev_xdp_install(dev, generic_xdp_install, NULL, 0, NULL));
+
+	/* Remove from the driver */
+	ndo_bpf = dev->netdev_ops->ndo_bpf;
+	if (!ndo_bpf)
+		return;
+
+	__dev_xdp_query(dev, ndo_bpf, &xdp);
+	if (xdp.prog_attached == XDP_ATTACHED_NONE)
+		return;
+
+	/* Program removal should always succeed */
+	WARN_ON(dev_xdp_install(dev, ndo_bpf, NULL, xdp.prog_flags, NULL));
+}
+
 /**
  *	dev_change_xdp_fd - set or clear a bpf program for a device rx path
  *	@dev: device
@@ -7134,10 +7178,10 @@ int dev_change_xdp_fd(struct net_device *dev, struct netlink_ext_ack *extack,
 		bpf_chk = generic_xdp_install;
 
 	if (fd >= 0) {
-		if (bpf_chk && __dev_xdp_attached(dev, bpf_chk, NULL))
+		if (bpf_chk && __dev_xdp_attached(dev, bpf_chk))
 			return -EEXIST;
 		if ((flags & XDP_FLAGS_UPDATE_IF_NOEXIST) &&
-		    __dev_xdp_attached(dev, bpf_op, NULL))
+		    __dev_xdp_attached(dev, bpf_op))
 			return -EBUSY;
 
 		prog = bpf_prog_get_type_dev(fd, BPF_PROG_TYPE_XDP,
@@ -7236,6 +7280,7 @@ static void rollback_registered_many(struct list_head *head)
 		/* Shutdown queueing discipline. */
 		dev_shutdown(dev);
 
+		dev_xdp_uninstall(dev);
 
 		/* Notify protocols, that we are about to destroy
 		 * this device. They should clean all the things.
@@ -8195,7 +8240,6 @@ EXPORT_SYMBOL(alloc_netdev_mqs);
 void free_netdev(struct net_device *dev)
 {
 	struct napi_struct *p, *n;
-	struct bpf_prog *prog;
 
 	might_sleep();
 	netif_free_tx_queues(dev);
@@ -8214,12 +8258,6 @@ void free_netdev(struct net_device *dev)
 	free_percpu(dev->pcpu_refcnt);
 	dev->pcpu_refcnt = NULL;
 
-	prog = rcu_dereference_protected(dev->xdp_prog, 1);
-	if (prog) {
-		bpf_prog_put(prog);
-		static_key_slow_dec(&generic_xdp_needed);
-	}
-
 	/*  Compatibility with error handling in drivers */
 	if (dev->reg_state == NETREG_UNINITIALIZED) {
 		netdev_freemem(dev);
diff --git a/net/core/dst.c b/net/core/dst.c
index 662a2d4a3d19..007aa0b08291 100644
--- a/net/core/dst.c
+++ b/net/core/dst.c
@@ -21,6 +21,7 @@
 #include <linux/sched.h>
 #include <linux/prefetch.h>
 #include <net/lwtunnel.h>
+#include <net/xfrm.h>
 
 #include <net/dst.h>
 #include <net/dst_metadata.h>
@@ -62,15 +63,12 @@ void dst_init(struct dst_entry *dst, struct dst_ops *ops,
 	      struct net_device *dev, int initial_ref, int initial_obsolete,
 	      unsigned short flags)
 {
-	dst->child = NULL;
 	dst->dev = dev;
 	if (dev)
 		dev_hold(dev);
 	dst->ops = ops;
 	dst_init_metrics(dst, dst_default_metrics.metrics, true);
 	dst->expires = 0UL;
-	dst->path = dst;
-	dst->from = NULL;
 #ifdef CONFIG_XFRM
 	dst->xfrm = NULL;
 #endif
@@ -88,7 +86,6 @@ void dst_init(struct dst_entry *dst, struct dst_ops *ops,
 	dst->__use = 0;
 	dst->lastuse = jiffies;
 	dst->flags = flags;
-	dst->next = NULL;
 	if (!(flags & DST_NOCOUNT))
 		dst_entries_add(ops, 1);
 }
@@ -116,12 +113,17 @@ EXPORT_SYMBOL(dst_alloc);
 
 struct dst_entry *dst_destroy(struct dst_entry * dst)
 {
-	struct dst_entry *child;
+	struct dst_entry *child = NULL;
 
 	smp_rmb();
 
-	child = dst->child;
+#ifdef CONFIG_XFRM
+	if (dst->xfrm) {
+		struct xfrm_dst *xdst = (struct xfrm_dst *) dst;
 
+		child = xdst->child;
+	}
+#endif
 	if (!(dst->flags & DST_NOCOUNT))
 		dst_entries_add(dst->ops, -1);
 
diff --git a/net/core/filter.c b/net/core/filter.c
index 6a85e67fafce..754abe1041b7 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -3013,6 +3013,8 @@ BPF_CALL_4(bpf_skb_set_tunnel_key, struct sk_buff *, skb,
 	info->key.tun_flags = TUNNEL_KEY | TUNNEL_CSUM | TUNNEL_NOCACHE;
 	if (flags & BPF_F_DONT_FRAGMENT)
 		info->key.tun_flags |= TUNNEL_DONT_FRAGMENT;
+	if (flags & BPF_F_ZERO_CSUM_TX)
+		info->key.tun_flags &= ~TUNNEL_CSUM;
 
 	info->key.tun_id = cpu_to_be64(from->tunnel_id);
 	info->key.tos = from->tunnel_tos;
@@ -3026,8 +3028,6 @@ BPF_CALL_4(bpf_skb_set_tunnel_key, struct sk_buff *, skb,
 				  IPV6_FLOWLABEL_MASK;
 	} else {
 		info->key.u.ipv4.dst = cpu_to_be32(from->remote_ipv4);
-		if (flags & BPF_F_ZERO_CSUM_TX)
-			info->key.tun_flags &= ~TUNNEL_CSUM;
 	}
 
 	return 0;
@@ -4437,6 +4437,42 @@ static u32 sock_ops_convert_ctx_access(enum bpf_access_type type,
 		*insn++ = BPF_LDX_MEM(BPF_H, si->dst_reg, si->dst_reg,
 				      offsetof(struct sock_common, skc_num));
 		break;
+
+	case offsetof(struct bpf_sock_ops, is_fullsock):
+		*insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(
+						struct bpf_sock_ops_kern,
+						is_fullsock),
+				      si->dst_reg, si->src_reg,
+				      offsetof(struct bpf_sock_ops_kern,
+					       is_fullsock));
+		break;
+
+/* Helper macro for adding read access to tcp_sock fields. */
+#define SOCK_OPS_GET_TCP32(FIELD_NAME)					      \
+	do {								      \
+		BUILD_BUG_ON(FIELD_SIZEOF(struct tcp_sock, FIELD_NAME) != 4); \
+		*insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(			      \
+						struct bpf_sock_ops_kern,     \
+						is_fullsock),		      \
+				      si->dst_reg, si->src_reg,		      \
+				      offsetof(struct bpf_sock_ops_kern,      \
+					       is_fullsock));		      \
+		*insn++ = BPF_JMP_IMM(BPF_JEQ, si->dst_reg, 0, 2);	      \
+		*insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(			      \
+						struct bpf_sock_ops_kern, sk),\
+				      si->dst_reg, si->src_reg,		      \
+				      offsetof(struct bpf_sock_ops_kern, sk));\
+		*insn++ = BPF_LDX_MEM(BPF_W, si->dst_reg, si->dst_reg,        \
+				      offsetof(struct tcp_sock, FIELD_NAME)); \
+	} while (0)
+
+	case offsetof(struct bpf_sock_ops, snd_cwnd):
+		SOCK_OPS_GET_TCP32(snd_cwnd);
+		break;
+
+	case offsetof(struct bpf_sock_ops, srtt_us):
+		SOCK_OPS_GET_TCP32(srtt_us);
+		break;
 	}
 	return insn - insn_buf;
 }
diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index 15ce30063765..cc75488d3653 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -133,10 +133,10 @@ skb_flow_dissect_set_enc_addr_type(enum flow_dissector_key_id type,
 	ctrl->addr_type = type;
 }
 
-static void
-__skb_flow_dissect_tunnel_info(const struct sk_buff *skb,
-			       struct flow_dissector *flow_dissector,
-			       void *target_container)
+void
+skb_flow_dissect_tunnel_info(const struct sk_buff *skb,
+			     struct flow_dissector *flow_dissector,
+			     void *target_container)
 {
 	struct ip_tunnel_info *info;
 	struct ip_tunnel_key *key;
@@ -212,6 +212,7 @@ __skb_flow_dissect_tunnel_info(const struct sk_buff *skb,
 		tp->dst = key->tp_dst;
 	}
 }
+EXPORT_SYMBOL(skb_flow_dissect_tunnel_info);
 
 static enum flow_dissect_ret
 __skb_flow_dissect_mpls(const struct sk_buff *skb,
@@ -576,9 +577,6 @@ bool __skb_flow_dissect(const struct sk_buff *skb,
 					      FLOW_DISSECTOR_KEY_BASIC,
 					      target_container);
 
-	__skb_flow_dissect_tunnel_info(skb, flow_dissector,
-				       target_container);
-
 	if (dissector_uses_key(flow_dissector,
 			       FLOW_DISSECTOR_KEY_ETH_ADDRS)) {
 		struct ethhdr *eth = eth_hdr(skb);
diff --git a/net/core/gen_stats.c b/net/core/gen_stats.c
index 87f28557b329..b2b2323bdc84 100644
--- a/net/core/gen_stats.c
+++ b/net/core/gen_stats.c
@@ -252,10 +252,10 @@ __gnet_stats_copy_queue_cpu(struct gnet_stats_queue *qstats,
 	}
 }
 
-static void __gnet_stats_copy_queue(struct gnet_stats_queue *qstats,
-				    const struct gnet_stats_queue __percpu *cpu,
-				    const struct gnet_stats_queue *q,
-				    __u32 qlen)
+void __gnet_stats_copy_queue(struct gnet_stats_queue *qstats,
+			     const struct gnet_stats_queue __percpu *cpu,
+			     const struct gnet_stats_queue *q,
+			     __u32 qlen)
 {
 	if (cpu) {
 		__gnet_stats_copy_queue_cpu(qstats, cpu);
@@ -269,6 +269,7 @@ static void __gnet_stats_copy_queue(struct gnet_stats_queue *qstats,
 
 	qstats->qlen = qlen;
 }
+EXPORT_SYMBOL(__gnet_stats_copy_queue);
 
 /**
  * gnet_stats_copy_queue - copy queue statistics into statistics TLV
diff --git a/net/core/pktgen.c b/net/core/pktgen.c
index f95a15086225..b9ce241cd28c 100644
--- a/net/core/pktgen.c
+++ b/net/core/pktgen.c
@@ -399,7 +399,7 @@ struct pktgen_dev {
 	__u8	ipsmode;		/* IPSEC mode (config) */
 	__u8	ipsproto;		/* IPSEC type (config) */
 	__u32	spi;
-	struct dst_entry dst;
+	struct xfrm_dst xdst;
 	struct dst_ops dstops;
 #endif
 	char result[512];
@@ -2609,7 +2609,7 @@ static int pktgen_output_ipsec(struct sk_buff *skb, struct pktgen_dev *pkt_dev)
 	 * supports both transport/tunnel mode + ESP/AH type.
 	 */
 	if ((x->props.mode == XFRM_MODE_TUNNEL) && (pkt_dev->spi != 0))
-		skb->_skb_refdst = (unsigned long)&pkt_dev->dst | SKB_DST_NOREF;
+		skb->_skb_refdst = (unsigned long)&pkt_dev->xdst.u.dst | SKB_DST_NOREF;
 
 	rcu_read_lock_bh();
 	err = x->outer_mode->output(x, skb);
@@ -3742,10 +3742,10 @@ static int pktgen_add_device(struct pktgen_thread *t, const char *ifname)
 	 * performance under such circumstance.
 	 */
 	pkt_dev->dstops.family = AF_INET;
-	pkt_dev->dst.dev = pkt_dev->odev;
-	dst_init_metrics(&pkt_dev->dst, pktgen_dst_metrics, false);
-	pkt_dev->dst.child = &pkt_dev->dst;
-	pkt_dev->dst.ops = &pkt_dev->dstops;
+	pkt_dev->xdst.u.dst.dev = pkt_dev->odev;
+	dst_init_metrics(&pkt_dev->xdst.u.dst, pktgen_dst_metrics, false);
+	pkt_dev->xdst.child = &pkt_dev->xdst.u.dst;
+	pkt_dev->xdst.u.dst.ops = &pkt_dev->dstops;
 #endif
 
 	return add_dev_to_thread(t, pkt_dev);
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index dabba2a91fc8..412ebf0b09c6 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -62,7 +62,9 @@
 struct rtnl_link {
 	rtnl_doit_func		doit;
 	rtnl_dumpit_func	dumpit;
+	struct module		*owner;
 	unsigned int		flags;
+	struct rcu_head		rcu;
 };
 
 static DEFINE_MUTEX(rtnl_mutex);
@@ -127,8 +129,7 @@ bool lockdep_rtnl_is_held(void)
 EXPORT_SYMBOL(lockdep_rtnl_is_held);
 #endif /* #ifdef CONFIG_PROVE_LOCKING */
 
-static struct rtnl_link __rcu *rtnl_msg_handlers[RTNL_FAMILY_MAX + 1];
-static refcount_t rtnl_msg_handlers_ref[RTNL_FAMILY_MAX + 1];
+static struct rtnl_link *__rcu *rtnl_msg_handlers[RTNL_FAMILY_MAX + 1];
 
 static inline int rtm_msgindex(int msgtype)
 {
@@ -144,72 +145,127 @@ static inline int rtm_msgindex(int msgtype)
 	return msgindex;
 }
 
-/**
- * __rtnl_register - Register a rtnetlink message type
- * @protocol: Protocol family or PF_UNSPEC
- * @msgtype: rtnetlink message type
- * @doit: Function pointer called for each request message
- * @dumpit: Function pointer called for each dump request (NLM_F_DUMP) message
- * @flags: rtnl_link_flags to modifiy behaviour of doit/dumpit functions
- *
- * Registers the specified function pointers (at least one of them has
- * to be non-NULL) to be called whenever a request message for the
- * specified protocol family and message type is received.
- *
- * The special protocol family PF_UNSPEC may be used to define fallback
- * function pointers for the case when no entry for the specific protocol
- * family exists.
- *
- * Returns 0 on success or a negative error code.
- */
-int __rtnl_register(int protocol, int msgtype,
-		    rtnl_doit_func doit, rtnl_dumpit_func dumpit,
-		    unsigned int flags)
+static struct rtnl_link *rtnl_get_link(int protocol, int msgtype)
+{
+	struct rtnl_link **tab;
+
+	if (protocol >= ARRAY_SIZE(rtnl_msg_handlers))
+		protocol = PF_UNSPEC;
+
+	tab = rcu_dereference_rtnl(rtnl_msg_handlers[protocol]);
+	if (!tab)
+		tab = rcu_dereference_rtnl(rtnl_msg_handlers[PF_UNSPEC]);
+
+	return tab[msgtype];
+}
+
+static int rtnl_register_internal(struct module *owner,
+				  int protocol, int msgtype,
+				  rtnl_doit_func doit, rtnl_dumpit_func dumpit,
+				  unsigned int flags)
 {
-	struct rtnl_link *tab;
+	struct rtnl_link *link, *old;
+	struct rtnl_link __rcu **tab;
 	int msgindex;
+	int ret = -ENOBUFS;
 
 	BUG_ON(protocol < 0 || protocol > RTNL_FAMILY_MAX);
 	msgindex = rtm_msgindex(msgtype);
 
-	tab = rcu_dereference_raw(rtnl_msg_handlers[protocol]);
+	rtnl_lock();
+	tab = rtnl_msg_handlers[protocol];
 	if (tab == NULL) {
-		tab = kcalloc(RTM_NR_MSGTYPES, sizeof(*tab), GFP_KERNEL);
-		if (tab == NULL)
-			return -ENOBUFS;
+		tab = kcalloc(RTM_NR_MSGTYPES, sizeof(void *), GFP_KERNEL);
+		if (!tab)
+			goto unlock;
 
+		/* ensures we see the 0 stores */
 		rcu_assign_pointer(rtnl_msg_handlers[protocol], tab);
 	}
 
+	old = rtnl_dereference(tab[msgindex]);
+	if (old) {
+		link = kmemdup(old, sizeof(*old), GFP_KERNEL);
+		if (!link)
+			goto unlock;
+	} else {
+		link = kzalloc(sizeof(*link), GFP_KERNEL);
+		if (!link)
+			goto unlock;
+	}
+
+	WARN_ON(link->owner && link->owner != owner);
+	link->owner = owner;
+
+	WARN_ON(doit && link->doit && link->doit != doit);
 	if (doit)
-		tab[msgindex].doit = doit;
+		link->doit = doit;
+	WARN_ON(dumpit && link->dumpit && link->dumpit != dumpit);
 	if (dumpit)
-		tab[msgindex].dumpit = dumpit;
-	tab[msgindex].flags |= flags;
+		link->dumpit = dumpit;
 
-	return 0;
+	link->flags |= flags;
+
+	/* publish protocol:msgtype */
+	rcu_assign_pointer(tab[msgindex], link);
+	ret = 0;
+	if (old)
+		kfree_rcu(old, rcu);
+unlock:
+	rtnl_unlock();
+	return ret;
+}
+
+/**
+ * rtnl_register_module - Register a rtnetlink message type
+ *
+ * @owner: module registering the hook (THIS_MODULE)
+ * @protocol: Protocol family or PF_UNSPEC
+ * @msgtype: rtnetlink message type
+ * @doit: Function pointer called for each request message
+ * @dumpit: Function pointer called for each dump request (NLM_F_DUMP) message
+ * @flags: rtnl_link_flags to modifiy behaviour of doit/dumpit functions
+ *
+ * Like rtnl_register, but for use by removable modules.
+ */
+int rtnl_register_module(struct module *owner,
+			 int protocol, int msgtype,
+			 rtnl_doit_func doit, rtnl_dumpit_func dumpit,
+			 unsigned int flags)
+{
+	return rtnl_register_internal(owner, protocol, msgtype,
+				      doit, dumpit, flags);
 }
-EXPORT_SYMBOL_GPL(__rtnl_register);
+EXPORT_SYMBOL_GPL(rtnl_register_module);
 
 /**
  * rtnl_register - Register a rtnetlink message type
+ * @protocol: Protocol family or PF_UNSPEC
+ * @msgtype: rtnetlink message type
+ * @doit: Function pointer called for each request message
+ * @dumpit: Function pointer called for each dump request (NLM_F_DUMP) message
+ * @flags: rtnl_link_flags to modifiy behaviour of doit/dumpit functions
  *
- * Identical to __rtnl_register() but panics on failure. This is useful
- * as failure of this function is very unlikely, it can only happen due
- * to lack of memory when allocating the chain to store all message
- * handlers for a protocol. Meant for use in init functions where lack
- * of memory implies no sense in continuing.
+ * Registers the specified function pointers (at least one of them has
+ * to be non-NULL) to be called whenever a request message for the
+ * specified protocol family and message type is received.
+ *
+ * The special protocol family PF_UNSPEC may be used to define fallback
+ * function pointers for the case when no entry for the specific protocol
+ * family exists.
  */
 void rtnl_register(int protocol, int msgtype,
 		   rtnl_doit_func doit, rtnl_dumpit_func dumpit,
 		   unsigned int flags)
 {
-	if (__rtnl_register(protocol, msgtype, doit, dumpit, flags) < 0)
-		panic("Unable to register rtnetlink message handler, "
-		      "protocol = %d, message type = %d\n",
-		      protocol, msgtype);
+	int err;
+
+	err = rtnl_register_internal(NULL, protocol, msgtype, doit, dumpit,
+				     flags);
+	if (err)
+		pr_err("Unable to register rtnetlink message handler, "
+		       "protocol = %d, message type = %d\n", protocol, msgtype);
 }
-EXPORT_SYMBOL_GPL(rtnl_register);
 
 /**
  * rtnl_unregister - Unregister a rtnetlink message type
@@ -220,24 +276,25 @@ EXPORT_SYMBOL_GPL(rtnl_register);
  */
 int rtnl_unregister(int protocol, int msgtype)
 {
-	struct rtnl_link *handlers;
+	struct rtnl_link **tab, *link;
 	int msgindex;
 
 	BUG_ON(protocol < 0 || protocol > RTNL_FAMILY_MAX);
 	msgindex = rtm_msgindex(msgtype);
 
 	rtnl_lock();
-	handlers = rtnl_dereference(rtnl_msg_handlers[protocol]);
-	if (!handlers) {
+	tab = rtnl_dereference(rtnl_msg_handlers[protocol]);
+	if (!tab) {
 		rtnl_unlock();
 		return -ENOENT;
 	}
 
-	handlers[msgindex].doit = NULL;
-	handlers[msgindex].dumpit = NULL;
-	handlers[msgindex].flags = 0;
+	link = tab[msgindex];
+	rcu_assign_pointer(tab[msgindex], NULL);
 	rtnl_unlock();
 
+	kfree_rcu(link, rcu);
+
 	return 0;
 }
 EXPORT_SYMBOL_GPL(rtnl_unregister);
@@ -251,20 +308,27 @@ EXPORT_SYMBOL_GPL(rtnl_unregister);
  */
 void rtnl_unregister_all(int protocol)
 {
-	struct rtnl_link *handlers;
+	struct rtnl_link **tab, *link;
+	int msgindex;
 
 	BUG_ON(protocol < 0 || protocol > RTNL_FAMILY_MAX);
 
 	rtnl_lock();
-	handlers = rtnl_dereference(rtnl_msg_handlers[protocol]);
+	tab = rtnl_msg_handlers[protocol];
 	RCU_INIT_POINTER(rtnl_msg_handlers[protocol], NULL);
+	for (msgindex = 0; msgindex < RTM_NR_MSGTYPES; msgindex++) {
+		link = tab[msgindex];
+		if (!link)
+			continue;
+
+		rcu_assign_pointer(tab[msgindex], NULL);
+		kfree_rcu(link, rcu);
+	}
 	rtnl_unlock();
 
 	synchronize_net();
 
-	while (refcount_read(&rtnl_msg_handlers_ref[protocol]) > 1)
-		schedule();
-	kfree(handlers);
+	kfree(tab);
 }
 EXPORT_SYMBOL_GPL(rtnl_unregister_all);
 
@@ -1261,6 +1325,7 @@ static u8 rtnl_xdp_attached_mode(struct net_device *dev, u32 *prog_id)
 {
 	const struct net_device_ops *ops = dev->netdev_ops;
 	const struct bpf_prog *generic_xdp_prog;
+	struct netdev_bpf xdp;
 
 	ASSERT_RTNL();
 
@@ -1273,7 +1338,10 @@ static u8 rtnl_xdp_attached_mode(struct net_device *dev, u32 *prog_id)
 	if (!ops->ndo_bpf)
 		return XDP_ATTACHED_NONE;
 
-	return __dev_xdp_attached(dev, ops->ndo_bpf, prog_id);
+	__dev_xdp_query(dev, ops->ndo_bpf, &xdp);
+	*prog_id = xdp.prog_id;
+
+	return xdp.prog_attached;
 }
 
 static int rtnl_xdp_fill(struct sk_buff *skb, struct net_device *dev)
@@ -1569,6 +1637,8 @@ static const struct nla_policy ifla_policy[IFLA_MAX+1] = {
 	[IFLA_PROMISCUITY]	= { .type = NLA_U32 },
 	[IFLA_NUM_TX_QUEUES]	= { .type = NLA_U32 },
 	[IFLA_NUM_RX_QUEUES]	= { .type = NLA_U32 },
+	[IFLA_GSO_MAX_SEGS]	= { .type = NLA_U32 },
+	[IFLA_GSO_MAX_SIZE]	= { .type = NLA_U32 },
 	[IFLA_PHYS_PORT_ID]	= { .type = NLA_BINARY, .len = MAX_PHYS_ITEM_ID_LEN },
 	[IFLA_CARRIER_CHANGES]	= { .type = NLA_U32 },  /* ignored */
 	[IFLA_PHYS_SWITCH_ID]	= { .type = NLA_BINARY, .len = MAX_PHYS_ITEM_ID_LEN },
@@ -2219,6 +2289,34 @@ static int do_setlink(const struct sk_buff *skb,
 		}
 	}
 
+	if (tb[IFLA_GSO_MAX_SIZE]) {
+		u32 max_size = nla_get_u32(tb[IFLA_GSO_MAX_SIZE]);
+
+		if (max_size > GSO_MAX_SIZE) {
+			err = -EINVAL;
+			goto errout;
+		}
+
+		if (dev->gso_max_size ^ max_size) {
+			netif_set_gso_max_size(dev, max_size);
+			status |= DO_SETLINK_MODIFIED;
+		}
+	}
+
+	if (tb[IFLA_GSO_MAX_SEGS]) {
+		u32 max_segs = nla_get_u32(tb[IFLA_GSO_MAX_SEGS]);
+
+		if (max_segs > GSO_MAX_SEGS) {
+			err = -EINVAL;
+			goto errout;
+		}
+
+		if (dev->gso_max_segs ^ max_segs) {
+			dev->gso_max_segs = max_segs;
+			status |= DO_SETLINK_MODIFIED;
+		}
+	}
+
 	if (tb[IFLA_OPERSTATE])
 		set_operstate(dev, nla_get_u8(tb[IFLA_OPERSTATE]));
 
@@ -2583,6 +2681,10 @@ struct net_device *rtnl_create_link(struct net *net,
 		dev->link_mode = nla_get_u8(tb[IFLA_LINKMODE]);
 	if (tb[IFLA_GROUP])
 		dev_set_group(dev, nla_get_u32(tb[IFLA_GROUP]));
+	if (tb[IFLA_GSO_MAX_SIZE])
+		netif_set_gso_max_size(dev, nla_get_u32(tb[IFLA_GSO_MAX_SIZE]));
+	if (tb[IFLA_GSO_MAX_SEGS])
+		dev->gso_max_size = nla_get_u32(tb[IFLA_GSO_MAX_SEGS]);
 
 	return dev;
 }
@@ -2973,18 +3075,26 @@ static int rtnl_dump_all(struct sk_buff *skb, struct netlink_callback *cb)
 		s_idx = 1;
 
 	for (idx = 1; idx <= RTNL_FAMILY_MAX; idx++) {
+		struct rtnl_link **tab;
 		int type = cb->nlh->nlmsg_type-RTM_BASE;
-		struct rtnl_link *handlers;
+		struct rtnl_link *link;
 		rtnl_dumpit_func dumpit;
 
 		if (idx < s_idx || idx == PF_PACKET)
 			continue;
 
-		handlers = rtnl_dereference(rtnl_msg_handlers[idx]);
-		if (!handlers)
+		if (type < 0 || type >= RTM_NR_MSGTYPES)
 			continue;
 
-		dumpit = READ_ONCE(handlers[type].dumpit);
+		tab = rcu_dereference_rtnl(rtnl_msg_handlers[idx]);
+		if (!tab)
+			continue;
+
+		link = tab[type];
+		if (!link)
+			continue;
+
+		dumpit = link->dumpit;
 		if (!dumpit)
 			continue;
 
@@ -4314,7 +4424,8 @@ static int rtnetlink_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh,
 			     struct netlink_ext_ack *extack)
 {
 	struct net *net = sock_net(skb->sk);
-	struct rtnl_link *handlers;
+	struct rtnl_link *link;
+	struct module *owner;
 	int err = -EOPNOTSUPP;
 	rtnl_doit_func doit;
 	unsigned int flags;
@@ -4338,79 +4449,85 @@ static int rtnetlink_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh,
 	if (kind != 2 && !netlink_net_capable(skb, CAP_NET_ADMIN))
 		return -EPERM;
 
-	if (family >= ARRAY_SIZE(rtnl_msg_handlers))
-		family = PF_UNSPEC;
-
 	rcu_read_lock();
-	handlers = rcu_dereference(rtnl_msg_handlers[family]);
-	if (!handlers) {
-		family = PF_UNSPEC;
-		handlers = rcu_dereference(rtnl_msg_handlers[family]);
-	}
-
 	if (kind == 2 && nlh->nlmsg_flags&NLM_F_DUMP) {
 		struct sock *rtnl;
 		rtnl_dumpit_func dumpit;
 		u16 min_dump_alloc = 0;
 
-		dumpit = READ_ONCE(handlers[type].dumpit);
-		if (!dumpit) {
+		link = rtnl_get_link(family, type);
+		if (!link || !link->dumpit) {
 			family = PF_UNSPEC;
-			handlers = rcu_dereference(rtnl_msg_handlers[PF_UNSPEC]);
-			if (!handlers)
-				goto err_unlock;
-
-			dumpit = READ_ONCE(handlers[type].dumpit);
-			if (!dumpit)
+			link = rtnl_get_link(family, type);
+			if (!link || !link->dumpit)
 				goto err_unlock;
 		}
-
-		refcount_inc(&rtnl_msg_handlers_ref[family]);
+		owner = link->owner;
+		dumpit = link->dumpit;
 
 		if (type == RTM_GETLINK - RTM_BASE)
 			min_dump_alloc = rtnl_calcit(skb, nlh);
 
+		err = 0;
+		/* need to do this before rcu_read_unlock() */
+		if (!try_module_get(owner))
+			err = -EPROTONOSUPPORT;
+
 		rcu_read_unlock();
 
 		rtnl = net->rtnl;
-		{
+		if (err == 0) {
 			struct netlink_dump_control c = {
 				.dump		= dumpit,
 				.min_dump_alloc	= min_dump_alloc,
+				.module		= owner,
 			};
 			err = netlink_dump_start(rtnl, skb, nlh, &c);
+			/* netlink_dump_start() will keep a reference on
+			 * module if dump is still in progress.
+			 */
+			module_put(owner);
 		}
-		refcount_dec(&rtnl_msg_handlers_ref[family]);
 		return err;
 	}
 
-	doit = READ_ONCE(handlers[type].doit);
-	if (!doit) {
+	link = rtnl_get_link(family, type);
+	if (!link || !link->doit) {
 		family = PF_UNSPEC;
-		handlers = rcu_dereference(rtnl_msg_handlers[family]);
+		link = rtnl_get_link(PF_UNSPEC, type);
+		if (!link || !link->doit)
+			goto out_unlock;
 	}
 
-	flags = READ_ONCE(handlers[type].flags);
+	owner = link->owner;
+	if (!try_module_get(owner)) {
+		err = -EPROTONOSUPPORT;
+		goto out_unlock;
+	}
+
+	flags = link->flags;
 	if (flags & RTNL_FLAG_DOIT_UNLOCKED) {
-		refcount_inc(&rtnl_msg_handlers_ref[family]);
-		doit = READ_ONCE(handlers[type].doit);
+		doit = link->doit;
 		rcu_read_unlock();
 		if (doit)
 			err = doit(skb, nlh, extack);
-		refcount_dec(&rtnl_msg_handlers_ref[family]);
+		module_put(owner);
 		return err;
 	}
-
 	rcu_read_unlock();
 
 	rtnl_lock();
-	handlers = rtnl_dereference(rtnl_msg_handlers[family]);
-	if (handlers) {
-		doit = READ_ONCE(handlers[type].doit);
-		if (doit)
-			err = doit(skb, nlh, extack);
-	}
+	link = rtnl_get_link(family, type);
+	if (link && link->doit)
+		err = link->doit(skb, nlh, extack);
 	rtnl_unlock();
+
+	module_put(owner);
+
+	return err;
+
+out_unlock:
+	rcu_read_unlock();
 	return err;
 
 err_unlock:
@@ -4498,11 +4615,6 @@ static struct pernet_operations rtnetlink_net_ops = {
 
 void __init rtnetlink_init(void)
 {
-	int i;
-
-	for (i = 0; i < ARRAY_SIZE(rtnl_msg_handlers_ref); i++)
-		refcount_set(&rtnl_msg_handlers_ref[i], 1);
-
 	if (register_pernet_subsys(&rtnetlink_net_ops))
 		panic("rtnetlink_init: cannot initialize rtnetlink\n");
 
diff --git a/net/core/sock_reuseport.c b/net/core/sock_reuseport.c
index 5eeb1d20cc38..c5bb52bc73a1 100644
--- a/net/core/sock_reuseport.c
+++ b/net/core/sock_reuseport.c
@@ -235,7 +235,9 @@ struct sock *reuseport_select_sock(struct sock *sk,
 
 		if (prog && skb)
 			sk2 = run_bpf(reuse, socks, prog, skb, hdr_len);
-		else
+
+		/* no bpf or invalid bpf result: fall back to hash usage */
+		if (!sk2)
 			sk2 = reuse->socks[reciprocal_scale(hash, socks)];
 	}
 
diff --git a/net/decnet/dn_dev.c b/net/decnet/dn_dev.c
index 9153247dad28..d1885cf59319 100644
--- a/net/decnet/dn_dev.c
+++ b/net/decnet/dn_dev.c
@@ -1418,9 +1418,12 @@ void __init dn_dev_init(void)
 
 	dn_dev_devices_on();
 
-	rtnl_register(PF_DECnet, RTM_NEWADDR, dn_nl_newaddr, NULL, 0);
-	rtnl_register(PF_DECnet, RTM_DELADDR, dn_nl_deladdr, NULL, 0);
-	rtnl_register(PF_DECnet, RTM_GETADDR, NULL, dn_nl_dump_ifaddr, 0);
+	rtnl_register_module(THIS_MODULE, PF_DECnet, RTM_NEWADDR,
+			     dn_nl_newaddr, NULL, 0);
+	rtnl_register_module(THIS_MODULE, PF_DECnet, RTM_DELADDR,
+			     dn_nl_deladdr, NULL, 0);
+	rtnl_register_module(THIS_MODULE, PF_DECnet, RTM_GETADDR,
+			     NULL, dn_nl_dump_ifaddr, 0);
 
 	proc_create("decnet_dev", S_IRUGO, init_net.proc_net, &dn_dev_seq_fops);
 
diff --git a/net/decnet/dn_fib.c b/net/decnet/dn_fib.c
index b37a1b833c77..fce94cbd4378 100644
--- a/net/decnet/dn_fib.c
+++ b/net/decnet/dn_fib.c
@@ -792,8 +792,10 @@ void __init dn_fib_init(void)
 
 	register_dnaddr_notifier(&dn_fib_dnaddr_notifier);
 
-	rtnl_register(PF_DECnet, RTM_NEWROUTE, dn_fib_rtm_newroute, NULL, 0);
-	rtnl_register(PF_DECnet, RTM_DELROUTE, dn_fib_rtm_delroute, NULL, 0);
+	rtnl_register_module(THIS_MODULE, PF_DECnet, RTM_NEWROUTE,
+			     dn_fib_rtm_newroute, NULL, 0);
+	rtnl_register_module(THIS_MODULE, PF_DECnet, RTM_DELROUTE,
+			     dn_fib_rtm_delroute, NULL, 0);
 }
 
 
diff --git a/net/decnet/dn_route.c b/net/decnet/dn_route.c
index 324cb9f2f551..73160d4aebbe 100644
--- a/net/decnet/dn_route.c
+++ b/net/decnet/dn_route.c
@@ -199,11 +199,11 @@ static void dn_dst_check_expire(struct timer_list *unused)
 						lockdep_is_held(&dn_rt_hash_table[i].lock))) != NULL) {
 			if (atomic_read(&rt->dst.__refcnt) > 1 ||
 			    (now - rt->dst.lastuse) < expire) {
-				rtp = &rt->dst.dn_next;
+				rtp = &rt->dn_next;
 				continue;
 			}
-			*rtp = rt->dst.dn_next;
-			rt->dst.dn_next = NULL;
+			*rtp = rt->dn_next;
+			rt->dn_next = NULL;
 			dst_dev_put(&rt->dst);
 			dst_release(&rt->dst);
 		}
@@ -233,11 +233,11 @@ static int dn_dst_gc(struct dst_ops *ops)
 						lockdep_is_held(&dn_rt_hash_table[i].lock))) != NULL) {
 			if (atomic_read(&rt->dst.__refcnt) > 1 ||
 			    (now - rt->dst.lastuse) < expire) {
-				rtp = &rt->dst.dn_next;
+				rtp = &rt->dn_next;
 				continue;
 			}
-			*rtp = rt->dst.dn_next;
-			rt->dst.dn_next = NULL;
+			*rtp = rt->dn_next;
+			rt->dn_next = NULL;
 			dst_dev_put(&rt->dst);
 			dst_release(&rt->dst);
 			break;
@@ -333,8 +333,8 @@ static int dn_insert_route(struct dn_route *rt, unsigned int hash, struct dn_rou
 						lockdep_is_held(&dn_rt_hash_table[hash].lock))) != NULL) {
 		if (compare_keys(&rth->fld, &rt->fld)) {
 			/* Put it first */
-			*rthp = rth->dst.dn_next;
-			rcu_assign_pointer(rth->dst.dn_next,
+			*rthp = rth->dn_next;
+			rcu_assign_pointer(rth->dn_next,
 					   dn_rt_hash_table[hash].chain);
 			rcu_assign_pointer(dn_rt_hash_table[hash].chain, rth);
 
@@ -345,10 +345,10 @@ static int dn_insert_route(struct dn_route *rt, unsigned int hash, struct dn_rou
 			*rp = rth;
 			return 0;
 		}
-		rthp = &rth->dst.dn_next;
+		rthp = &rth->dn_next;
 	}
 
-	rcu_assign_pointer(rt->dst.dn_next, dn_rt_hash_table[hash].chain);
+	rcu_assign_pointer(rt->dn_next, dn_rt_hash_table[hash].chain);
 	rcu_assign_pointer(dn_rt_hash_table[hash].chain, rt);
 
 	dst_hold_and_use(&rt->dst, now);
@@ -369,8 +369,8 @@ static void dn_run_flush(struct timer_list *unused)
 			goto nothing_to_declare;
 
 		for(; rt; rt = next) {
-			next = rcu_dereference_raw(rt->dst.dn_next);
-			RCU_INIT_POINTER(rt->dst.dn_next, NULL);
+			next = rcu_dereference_raw(rt->dn_next);
+			RCU_INIT_POINTER(rt->dn_next, NULL);
 			dst_dev_put(&rt->dst);
 			dst_release(&rt->dst);
 		}
@@ -1183,6 +1183,7 @@ make_route:
 	if (rt == NULL)
 		goto e_nobufs;
 
+	rt->dn_next = NULL;
 	memset(&rt->fld, 0, sizeof(rt->fld));
 	rt->fld.saddr        = oldflp->saddr;
 	rt->fld.daddr        = oldflp->daddr;
@@ -1252,7 +1253,7 @@ static int __dn_route_output_key(struct dst_entry **pprt, const struct flowidn *
 	if (!(flags & MSG_TRYHARD)) {
 		rcu_read_lock_bh();
 		for (rt = rcu_dereference_bh(dn_rt_hash_table[hash].chain); rt;
-			rt = rcu_dereference_bh(rt->dst.dn_next)) {
+			rt = rcu_dereference_bh(rt->dn_next)) {
 			if ((flp->daddr == rt->fld.daddr) &&
 			    (flp->saddr == rt->fld.saddr) &&
 			    (flp->flowidn_mark == rt->fld.flowidn_mark) &&
@@ -1448,6 +1449,7 @@ make_route:
 	if (rt == NULL)
 		goto e_nobufs;
 
+	rt->dn_next = NULL;
 	memset(&rt->fld, 0, sizeof(rt->fld));
 	rt->rt_saddr      = fld.saddr;
 	rt->rt_daddr      = fld.daddr;
@@ -1529,7 +1531,7 @@ static int dn_route_input(struct sk_buff *skb)
 
 	rcu_read_lock();
 	for(rt = rcu_dereference(dn_rt_hash_table[hash].chain); rt != NULL;
-	    rt = rcu_dereference(rt->dst.dn_next)) {
+	    rt = rcu_dereference(rt->dn_next)) {
 		if ((rt->fld.saddr == cb->src) &&
 		    (rt->fld.daddr == cb->dst) &&
 		    (rt->fld.flowidn_oif == 0) &&
@@ -1749,7 +1751,7 @@ int dn_cache_dump(struct sk_buff *skb, struct netlink_callback *cb)
 		rcu_read_lock_bh();
 		for(rt = rcu_dereference_bh(dn_rt_hash_table[h].chain), idx = 0;
 			rt;
-			rt = rcu_dereference_bh(rt->dst.dn_next), idx++) {
+			rt = rcu_dereference_bh(rt->dn_next), idx++) {
 			if (idx < s_idx)
 				continue;
 			skb_dst_set(skb, dst_clone(&rt->dst));
@@ -1795,7 +1797,7 @@ static struct dn_route *dn_rt_cache_get_next(struct seq_file *seq, struct dn_rou
 {
 	struct dn_rt_cache_iter_state *s = seq->private;
 
-	rt = rcu_dereference_bh(rt->dst.dn_next);
+	rt = rcu_dereference_bh(rt->dn_next);
 	while (!rt) {
 		rcu_read_unlock_bh();
 		if (--s->bucket < 0)
@@ -1921,11 +1923,11 @@ void __init dn_route_init(void)
 		    &dn_rt_cache_seq_fops);
 
 #ifdef CONFIG_DECNET_ROUTER
-	rtnl_register(PF_DECnet, RTM_GETROUTE, dn_cache_getroute,
-		      dn_fib_dump, 0);
+	rtnl_register_module(THIS_MODULE, PF_DECnet, RTM_GETROUTE,
+			     dn_cache_getroute, dn_fib_dump, 0);
 #else
-	rtnl_register(PF_DECnet, RTM_GETROUTE, dn_cache_getroute,
-		      dn_cache_dump, 0);
+	rtnl_register_module(THIS_MODULE, PF_DECnet, RTM_GETROUTE,
+			     dn_cache_getroute, dn_cache_dump, 0);
 #endif
 }
 
diff --git a/net/dsa/Kconfig b/net/dsa/Kconfig
index 03c3bdf25468..bbf2c82cf7b2 100644
--- a/net/dsa/Kconfig
+++ b/net/dsa/Kconfig
@@ -16,6 +16,15 @@ config NET_DSA
 
 if NET_DSA
 
+config NET_DSA_LEGACY
+	bool "Support for older platform device and Device Tree registration"
+	default y
+	---help---
+	  Say Y if you want to enable support for the older platform device and
+	  deprecated Device Tree binding registration.
+
+	  This feature is scheduled for removal in 4.17.
+
 # tagging formats
 config NET_DSA_TAG_BRCM
 	bool
diff --git a/net/dsa/Makefile b/net/dsa/Makefile
index 0e13c1f95d13..9e4d3536f977 100644
--- a/net/dsa/Makefile
+++ b/net/dsa/Makefile
@@ -1,7 +1,8 @@
 # SPDX-License-Identifier: GPL-2.0
 # the core
 obj-$(CONFIG_NET_DSA) += dsa_core.o
-dsa_core-y += dsa.o dsa2.o legacy.o master.o port.o slave.o switch.o
+dsa_core-y += dsa.o dsa2.o master.o port.o slave.o switch.o
+dsa_core-$(CONFIG_NET_DSA_LEGACY) += legacy.o
 
 # tagging formats
 dsa_core-$(CONFIG_NET_DSA_TAG_BRCM) += tag_brcm.o
diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c
index 1e287420ff49..21f9bed11988 100644
--- a/net/dsa/dsa2.c
+++ b/net/dsa/dsa2.c
@@ -241,7 +241,7 @@ static int dsa_tree_setup_default_cpu(struct dsa_switch_tree *dst)
 		for (port = 0; port < ds->num_ports; port++) {
 			dp = &ds->ports[port];
 
-			if (dsa_port_is_user(dp))
+			if (dsa_port_is_user(dp) || dsa_port_is_dsa(dp))
 				dp->cpu_dp = dst->cpu_dp;
 		}
 	}
diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h
index 7d036696e8c4..b03665e8fb4e 100644
--- a/net/dsa/dsa_priv.h
+++ b/net/dsa/dsa_priv.h
@@ -97,8 +97,17 @@ const struct dsa_device_ops *dsa_resolve_tag_protocol(int tag_protocol);
 bool dsa_schedule_work(struct work_struct *work);
 
 /* legacy.c */
+#if IS_ENABLED(CONFIG_NET_DSA_LEGACY)
 int dsa_legacy_register(void);
 void dsa_legacy_unregister(void);
+#else
+static inline int dsa_legacy_register(void)
+{
+	return -ENODEV;
+}
+
+static inline void dsa_legacy_unregister(void) { }
+#endif
 int dsa_legacy_fdb_add(struct ndmsg *ndm, struct nlattr *tb[],
 		       struct net_device *dev,
 		       const unsigned char *addr, u16 vid,
diff --git a/net/dsa/legacy.c b/net/dsa/legacy.c
index 84611d7fcfa2..aa56d3fb5da4 100644
--- a/net/dsa/legacy.c
+++ b/net/dsa/legacy.c
@@ -718,26 +718,6 @@ static int dsa_resume(struct device *d)
 }
 #endif
 
-/* legacy way, bypassing the bridge *****************************************/
-int dsa_legacy_fdb_add(struct ndmsg *ndm, struct nlattr *tb[],
-		       struct net_device *dev,
-		       const unsigned char *addr, u16 vid,
-		       u16 flags)
-{
-	struct dsa_port *dp = dsa_slave_to_port(dev);
-
-	return dsa_port_fdb_add(dp, addr, vid);
-}
-
-int dsa_legacy_fdb_del(struct ndmsg *ndm, struct nlattr *tb[],
-		       struct net_device *dev,
-		       const unsigned char *addr, u16 vid)
-{
-	struct dsa_port *dp = dsa_slave_to_port(dev);
-
-	return dsa_port_fdb_del(dp, addr, vid);
-}
-
 static SIMPLE_DEV_PM_OPS(dsa_pm_ops, dsa_suspend, dsa_resume);
 
 static const struct of_device_id dsa_of_match_table[] = {
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index d6e7a642493b..5d6475a6cc5d 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -709,14 +709,12 @@ static int dsa_slave_add_cls_matchall(struct net_device *dev,
 	struct dsa_slave_priv *p = netdev_priv(dev);
 	struct dsa_mall_tc_entry *mall_tc_entry;
 	__be16 protocol = cls->common.protocol;
-	struct net *net = dev_net(dev);
 	struct dsa_switch *ds = dp->ds;
 	struct net_device *to_dev;
 	const struct tc_action *a;
 	struct dsa_port *to_dp;
 	int err = -EOPNOTSUPP;
 	LIST_HEAD(actions);
-	int ifindex;
 
 	if (!ds->ops->port_mirror_add)
 		return err;
@@ -730,8 +728,7 @@ static int dsa_slave_add_cls_matchall(struct net_device *dev,
 	if (is_tcf_mirred_egress_mirror(a) && protocol == htons(ETH_P_ALL)) {
 		struct dsa_mall_mirror_tc_entry *mirror;
 
-		ifindex = tcf_mirred_ifindex(a);
-		to_dev = __dev_get_by_index(net, ifindex);
+		to_dev = tcf_mirred_dev(a);
 		if (!to_dev)
 			return -EINVAL;
 
@@ -944,6 +941,26 @@ static const struct ethtool_ops dsa_slave_ethtool_ops = {
 	.set_rxnfc		= dsa_slave_set_rxnfc,
 };
 
+/* legacy way, bypassing the bridge *****************************************/
+int dsa_legacy_fdb_add(struct ndmsg *ndm, struct nlattr *tb[],
+		       struct net_device *dev,
+		       const unsigned char *addr, u16 vid,
+		       u16 flags)
+{
+	struct dsa_port *dp = dsa_slave_to_port(dev);
+
+	return dsa_port_fdb_add(dp, addr, vid);
+}
+
+int dsa_legacy_fdb_del(struct ndmsg *ndm, struct nlattr *tb[],
+		       struct net_device *dev,
+		       const unsigned char *addr, u16 vid)
+{
+	struct dsa_port *dp = dsa_slave_to_port(dev);
+
+	return dsa_port_fdb_del(dp, addr, vid);
+}
+
 static const struct net_device_ops dsa_slave_netdev_ops = {
 	.ndo_open	 	= dsa_slave_open,
 	.ndo_stop		= dsa_slave_close,
diff --git a/net/dsa/switch.c b/net/dsa/switch.c
index 29608d087a7c..b93511726069 100644
--- a/net/dsa/switch.c
+++ b/net/dsa/switch.c
@@ -83,29 +83,52 @@ static int dsa_switch_bridge_leave(struct dsa_switch *ds,
 static int dsa_switch_fdb_add(struct dsa_switch *ds,
 			      struct dsa_notifier_fdb_info *info)
 {
-	/* Do not care yet about other switch chips of the fabric */
-	if (ds->index != info->sw_index)
-		return 0;
+	int port = dsa_towards_port(ds, info->sw_index, info->port);
 
 	if (!ds->ops->port_fdb_add)
 		return -EOPNOTSUPP;
 
-	return ds->ops->port_fdb_add(ds, info->port, info->addr,
-				     info->vid);
+	return ds->ops->port_fdb_add(ds, port, info->addr, info->vid);
 }
 
 static int dsa_switch_fdb_del(struct dsa_switch *ds,
 			      struct dsa_notifier_fdb_info *info)
 {
-	/* Do not care yet about other switch chips of the fabric */
-	if (ds->index != info->sw_index)
-		return 0;
+	int port = dsa_towards_port(ds, info->sw_index, info->port);
 
 	if (!ds->ops->port_fdb_del)
 		return -EOPNOTSUPP;
 
-	return ds->ops->port_fdb_del(ds, info->port, info->addr,
-				     info->vid);
+	return ds->ops->port_fdb_del(ds, port, info->addr, info->vid);
+}
+
+static int
+dsa_switch_mdb_prepare_bitmap(struct dsa_switch *ds,
+			      const struct switchdev_obj_port_mdb *mdb,
+			      const unsigned long *bitmap)
+{
+	int port, err;
+
+	if (!ds->ops->port_mdb_prepare || !ds->ops->port_mdb_add)
+		return -EOPNOTSUPP;
+
+	for_each_set_bit(port, bitmap, ds->num_ports) {
+		err = ds->ops->port_mdb_prepare(ds, port, mdb);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
+static void dsa_switch_mdb_add_bitmap(struct dsa_switch *ds,
+				      const struct switchdev_obj_port_mdb *mdb,
+				      const unsigned long *bitmap)
+{
+	int port;
+
+	for_each_set_bit(port, bitmap, ds->num_ports)
+		ds->ops->port_mdb_add(ds, port, mdb);
 }
 
 static int dsa_switch_mdb_add(struct dsa_switch *ds,
@@ -114,7 +137,7 @@ static int dsa_switch_mdb_add(struct dsa_switch *ds,
 	const struct switchdev_obj_port_mdb *mdb = info->mdb;
 	struct switchdev_trans *trans = info->trans;
 	DECLARE_BITMAP(group, ds->num_ports);
-	int port, err;
+	int port;
 
 	/* Build a mask of Multicast group members */
 	bitmap_zero(group, ds->num_ports);
@@ -124,21 +147,10 @@ static int dsa_switch_mdb_add(struct dsa_switch *ds,
 		if (dsa_is_dsa_port(ds, port))
 			set_bit(port, group);
 
-	if (switchdev_trans_ph_prepare(trans)) {
-		if (!ds->ops->port_mdb_prepare || !ds->ops->port_mdb_add)
-			return -EOPNOTSUPP;
-
-		for_each_set_bit(port, group, ds->num_ports) {
-			err = ds->ops->port_mdb_prepare(ds, port, mdb, trans);
-			if (err)
-				return err;
-		}
-
-		return 0;
-	}
+	if (switchdev_trans_ph_prepare(trans))
+		return dsa_switch_mdb_prepare_bitmap(ds, mdb, group);
 
-	for_each_set_bit(port, group, ds->num_ports)
-		ds->ops->port_mdb_add(ds, port, mdb, trans);
+	dsa_switch_mdb_add_bitmap(ds, mdb, group);
 
 	return 0;
 }
@@ -157,13 +169,43 @@ static int dsa_switch_mdb_del(struct dsa_switch *ds,
 	return 0;
 }
 
+static int
+dsa_switch_vlan_prepare_bitmap(struct dsa_switch *ds,
+			       const struct switchdev_obj_port_vlan *vlan,
+			       const unsigned long *bitmap)
+{
+	int port, err;
+
+	if (!ds->ops->port_vlan_prepare || !ds->ops->port_vlan_add)
+		return -EOPNOTSUPP;
+
+	for_each_set_bit(port, bitmap, ds->num_ports) {
+		err = ds->ops->port_vlan_prepare(ds, port, vlan);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
+static void
+dsa_switch_vlan_add_bitmap(struct dsa_switch *ds,
+			   const struct switchdev_obj_port_vlan *vlan,
+			   const unsigned long *bitmap)
+{
+	int port;
+
+	for_each_set_bit(port, bitmap, ds->num_ports)
+		ds->ops->port_vlan_add(ds, port, vlan);
+}
+
 static int dsa_switch_vlan_add(struct dsa_switch *ds,
 			       struct dsa_notifier_vlan_info *info)
 {
 	const struct switchdev_obj_port_vlan *vlan = info->vlan;
 	struct switchdev_trans *trans = info->trans;
 	DECLARE_BITMAP(members, ds->num_ports);
-	int port, err;
+	int port;
 
 	/* Build a mask of VLAN members */
 	bitmap_zero(members, ds->num_ports);
@@ -173,21 +215,10 @@ static int dsa_switch_vlan_add(struct dsa_switch *ds,
 		if (dsa_is_cpu_port(ds, port) || dsa_is_dsa_port(ds, port))
 			set_bit(port, members);
 
-	if (switchdev_trans_ph_prepare(trans)) {
-		if (!ds->ops->port_vlan_prepare || !ds->ops->port_vlan_add)
-			return -EOPNOTSUPP;
-
-		for_each_set_bit(port, members, ds->num_ports) {
-			err = ds->ops->port_vlan_prepare(ds, port, vlan, trans);
-			if (err)
-				return err;
-		}
-
-		return 0;
-	}
+	if (switchdev_trans_ph_prepare(trans))
+		return dsa_switch_vlan_prepare_bitmap(ds, vlan, members);
 
-	for_each_set_bit(port, members, ds->num_ports)
-		ds->ops->port_vlan_add(ds, port, vlan, trans);
+	dsa_switch_vlan_add_bitmap(ds, vlan, members);
 
 	return 0;
 }
diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
index e7d15fb0d94d..f6f58108b4c5 100644
--- a/net/ipv4/inet_hashtables.c
+++ b/net/ipv4/inet_hashtables.c
@@ -19,6 +19,7 @@
 #include <linux/slab.h>
 #include <linux/wait.h>
 #include <linux/vmalloc.h>
+#include <linux/bootmem.h>
 
 #include <net/addrconf.h>
 #include <net/inet_connection_sock.h>
@@ -168,6 +169,60 @@ int __inet_inherit_port(const struct sock *sk, struct sock *child)
 }
 EXPORT_SYMBOL_GPL(__inet_inherit_port);
 
+static struct inet_listen_hashbucket *
+inet_lhash2_bucket_sk(struct inet_hashinfo *h, struct sock *sk)
+{
+	u32 hash;
+
+#if IS_ENABLED(CONFIG_IPV6)
+	if (sk->sk_family == AF_INET6)
+		hash = ipv6_portaddr_hash(sock_net(sk),
+					  &sk->sk_v6_rcv_saddr,
+					  inet_sk(sk)->inet_num);
+	else
+#endif
+		hash = ipv4_portaddr_hash(sock_net(sk),
+					  inet_sk(sk)->inet_rcv_saddr,
+					  inet_sk(sk)->inet_num);
+	return inet_lhash2_bucket(h, hash);
+}
+
+static void inet_hash2(struct inet_hashinfo *h, struct sock *sk)
+{
+	struct inet_listen_hashbucket *ilb2;
+
+	if (!h->lhash2)
+		return;
+
+	ilb2 = inet_lhash2_bucket_sk(h, sk);
+
+	spin_lock(&ilb2->lock);
+	if (sk->sk_reuseport && sk->sk_family == AF_INET6)
+		hlist_add_tail_rcu(&inet_csk(sk)->icsk_listen_portaddr_node,
+				   &ilb2->head);
+	else
+		hlist_add_head_rcu(&inet_csk(sk)->icsk_listen_portaddr_node,
+				   &ilb2->head);
+	ilb2->count++;
+	spin_unlock(&ilb2->lock);
+}
+
+static void inet_unhash2(struct inet_hashinfo *h, struct sock *sk)
+{
+	struct inet_listen_hashbucket *ilb2;
+
+	if (!h->lhash2 ||
+	    WARN_ON_ONCE(hlist_unhashed(&inet_csk(sk)->icsk_listen_portaddr_node)))
+		return;
+
+	ilb2 = inet_lhash2_bucket_sk(h, sk);
+
+	spin_lock(&ilb2->lock);
+	hlist_del_init_rcu(&inet_csk(sk)->icsk_listen_portaddr_node);
+	ilb2->count--;
+	spin_unlock(&ilb2->lock);
+}
+
 static inline int compute_score(struct sock *sk, struct net *net,
 				const unsigned short hnum, const __be32 daddr,
 				const int dif, const int sdif, bool exact_dif)
@@ -207,6 +262,40 @@ static inline int compute_score(struct sock *sk, struct net *net,
  */
 
 /* called with rcu_read_lock() : No refcount taken on the socket */
+static struct sock *inet_lhash2_lookup(struct net *net,
+				struct inet_listen_hashbucket *ilb2,
+				struct sk_buff *skb, int doff,
+				const __be32 saddr, __be16 sport,
+				const __be32 daddr, const unsigned short hnum,
+				const int dif, const int sdif)
+{
+	bool exact_dif = inet_exact_dif_match(net, skb);
+	struct inet_connection_sock *icsk;
+	struct sock *sk, *result = NULL;
+	int score, hiscore = 0;
+	u32 phash = 0;
+
+	inet_lhash2_for_each_icsk_rcu(icsk, &ilb2->head) {
+		sk = (struct sock *)icsk;
+		score = compute_score(sk, net, hnum, daddr,
+				      dif, sdif, exact_dif);
+		if (score > hiscore) {
+			if (sk->sk_reuseport) {
+				phash = inet_ehashfn(net, daddr, hnum,
+						     saddr, sport);
+				result = reuseport_select_sock(sk, phash,
+							       skb, doff);
+				if (result)
+					return result;
+			}
+			result = sk;
+			hiscore = score;
+		}
+	}
+
+	return result;
+}
+
 struct sock *__inet_lookup_listener(struct net *net,
 				    struct inet_hashinfo *hashinfo,
 				    struct sk_buff *skb, int doff,
@@ -216,32 +305,57 @@ struct sock *__inet_lookup_listener(struct net *net,
 {
 	unsigned int hash = inet_lhashfn(net, hnum);
 	struct inet_listen_hashbucket *ilb = &hashinfo->listening_hash[hash];
-	int score, hiscore = 0, matches = 0, reuseport = 0;
 	bool exact_dif = inet_exact_dif_match(net, skb);
+	struct inet_listen_hashbucket *ilb2;
 	struct sock *sk, *result = NULL;
+	int score, hiscore = 0;
+	unsigned int hash2;
 	u32 phash = 0;
 
+	if (ilb->count <= 10 || !hashinfo->lhash2)
+		goto port_lookup;
+
+	/* Too many sk in the ilb bucket (which is hashed by port alone).
+	 * Try lhash2 (which is hashed by port and addr) instead.
+	 */
+
+	hash2 = ipv4_portaddr_hash(net, daddr, hnum);
+	ilb2 = inet_lhash2_bucket(hashinfo, hash2);
+	if (ilb2->count > ilb->count)
+		goto port_lookup;
+
+	result = inet_lhash2_lookup(net, ilb2, skb, doff,
+				    saddr, sport, daddr, hnum,
+				    dif, sdif);
+	if (result)
+		return result;
+
+	/* Lookup lhash2 with INADDR_ANY */
+
+	hash2 = ipv4_portaddr_hash(net, htonl(INADDR_ANY), hnum);
+	ilb2 = inet_lhash2_bucket(hashinfo, hash2);
+	if (ilb2->count > ilb->count)
+		goto port_lookup;
+
+	return inet_lhash2_lookup(net, ilb2, skb, doff,
+				  saddr, sport, daddr, hnum,
+				  dif, sdif);
+
+port_lookup:
 	sk_for_each_rcu(sk, &ilb->head) {
 		score = compute_score(sk, net, hnum, daddr,
 				      dif, sdif, exact_dif);
 		if (score > hiscore) {
-			reuseport = sk->sk_reuseport;
-			if (reuseport) {
+			if (sk->sk_reuseport) {
 				phash = inet_ehashfn(net, daddr, hnum,
 						     saddr, sport);
 				result = reuseport_select_sock(sk, phash,
 							       skb, doff);
 				if (result)
 					return result;
-				matches = 1;
 			}
 			result = sk;
 			hiscore = score;
-		} else if (score == hiscore && reuseport) {
-			matches++;
-			if (reciprocal_scale(phash, matches) == 0)
-				result = sk;
-			phash = next_pseudo_random32(phash);
 		}
 	}
 	return result;
@@ -483,6 +597,8 @@ int __inet_hash(struct sock *sk, struct sock *osk)
 		hlist_add_tail_rcu(&sk->sk_node, &ilb->head);
 	else
 		hlist_add_head_rcu(&sk->sk_node, &ilb->head);
+	inet_hash2(hashinfo, sk);
+	ilb->count++;
 	sock_set_flag(sk, SOCK_RCU_FREE);
 	sock_prot_inuse_add(sock_net(sk), sk->sk_prot, 1);
 unlock:
@@ -509,28 +625,35 @@ EXPORT_SYMBOL_GPL(inet_hash);
 void inet_unhash(struct sock *sk)
 {
 	struct inet_hashinfo *hashinfo = sk->sk_prot->h.hashinfo;
+	struct inet_listen_hashbucket *ilb;
 	spinlock_t *lock;
 	bool listener = false;
-	int done;
 
 	if (sk_unhashed(sk))
 		return;
 
 	if (sk->sk_state == TCP_LISTEN) {
-		lock = &hashinfo->listening_hash[inet_sk_listen_hashfn(sk)].lock;
+		ilb = &hashinfo->listening_hash[inet_sk_listen_hashfn(sk)];
+		lock = &ilb->lock;
 		listener = true;
 	} else {
 		lock = inet_ehash_lockp(hashinfo, sk->sk_hash);
 	}
 	spin_lock_bh(lock);
+	if (sk_unhashed(sk))
+		goto unlock;
+
 	if (rcu_access_pointer(sk->sk_reuseport_cb))
 		reuseport_detach_sock(sk);
-	if (listener)
-		done = __sk_del_node_init(sk);
-	else
-		done = __sk_nulls_del_node_init_rcu(sk);
-	if (done)
-		sock_prot_inuse_add(sock_net(sk), sk->sk_prot, -1);
+	if (listener) {
+		inet_unhash2(hashinfo, sk);
+		 __sk_del_node_init(sk);
+		 ilb->count--;
+	} else {
+		__sk_nulls_del_node_init_rcu(sk);
+	}
+	sock_prot_inuse_add(sock_net(sk), sk->sk_prot, -1);
+unlock:
 	spin_unlock_bh(lock);
 }
 EXPORT_SYMBOL_GPL(inet_unhash);
@@ -665,10 +788,37 @@ void inet_hashinfo_init(struct inet_hashinfo *h)
 	for (i = 0; i < INET_LHTABLE_SIZE; i++) {
 		spin_lock_init(&h->listening_hash[i].lock);
 		INIT_HLIST_HEAD(&h->listening_hash[i].head);
+		h->listening_hash[i].count = 0;
 	}
+
+	h->lhash2 = NULL;
 }
 EXPORT_SYMBOL_GPL(inet_hashinfo_init);
 
+void __init inet_hashinfo2_init(struct inet_hashinfo *h, const char *name,
+				unsigned long numentries, int scale,
+				unsigned long low_limit,
+				unsigned long high_limit)
+{
+	unsigned int i;
+
+	h->lhash2 = alloc_large_system_hash(name,
+					    sizeof(*h->lhash2),
+					    numentries,
+					    scale,
+					    0,
+					    NULL,
+					    &h->lhash2_mask,
+					    low_limit,
+					    high_limit);
+
+	for (i = 0; i <= h->lhash2_mask; i++) {
+		spin_lock_init(&h->lhash2[i].lock);
+		INIT_HLIST_HEAD(&h->lhash2[i].head);
+		h->lhash2[i].count = 0;
+	}
+}
+
 int inet_ehash_locks_alloc(struct inet_hashinfo *hashinfo)
 {
 	unsigned int locksz = sizeof(spinlock_t);
diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c
index bb6239169b1a..d828821d88d7 100644
--- a/net/ipv4/ip_gre.c
+++ b/net/ipv4/ip_gre.c
@@ -114,7 +114,8 @@ MODULE_PARM_DESC(log_ecn_error, "Log packets received with corrupted ECN");
 static struct rtnl_link_ops ipgre_link_ops __read_mostly;
 static int ipgre_tunnel_init(struct net_device *dev);
 static void erspan_build_header(struct sk_buff *skb,
-				__be32 id, u32 index, bool truncate);
+				__be32 id, u32 index,
+				bool truncate, bool is_ipv4);
 
 static unsigned int ipgre_net_id __read_mostly;
 static unsigned int gre_tap_net_id __read_mostly;
@@ -589,7 +590,7 @@ static void erspan_fb_xmit(struct sk_buff *skb, struct net_device *dev,
 		goto err_free_rt;
 
 	erspan_build_header(skb, tunnel_id_to_key32(key->tun_id),
-			    ntohl(md->index), truncate);
+			    ntohl(md->index), truncate, true);
 
 	gre_build_header(skb, 8, TUNNEL_SEQ,
 			 htons(ETH_P_ERSPAN), 0, htonl(tunnel->o_seqno++));
@@ -668,52 +669,6 @@ free_skb:
 	return NETDEV_TX_OK;
 }
 
-static inline u8 tos_to_cos(u8 tos)
-{
-	u8 dscp, cos;
-
-	dscp = tos >> 2;
-	cos = dscp >> 3;
-	return cos;
-}
-
-static void erspan_build_header(struct sk_buff *skb,
-				__be32 id, u32 index, bool truncate)
-{
-	struct iphdr *iphdr = ip_hdr(skb);
-	struct ethhdr *eth = eth_hdr(skb);
-	enum erspan_encap_type enc_type;
-	struct erspanhdr *ershdr;
-	struct qtag_prefix {
-		__be16 eth_type;
-		__be16 tci;
-	} *qp;
-	u16 vlan_tci = 0;
-
-	enc_type = ERSPAN_ENCAP_NOVLAN;
-
-	/* If mirrored packet has vlan tag, extract tci and
-	 *  perserve vlan header in the mirrored frame.
-	 */
-	if (eth->h_proto == htons(ETH_P_8021Q)) {
-		qp = (struct qtag_prefix *)(skb->data + 2 * ETH_ALEN);
-		vlan_tci = ntohs(qp->tci);
-		enc_type = ERSPAN_ENCAP_INFRAME;
-	}
-
-	skb_push(skb, sizeof(*ershdr));
-	ershdr = (struct erspanhdr *)skb->data;
-	memset(ershdr, 0, sizeof(*ershdr));
-
-	ershdr->ver_vlan = htons((vlan_tci & VLAN_MASK) |
-				 (ERSPAN_VERSION << VER_OFFSET));
-	ershdr->session_id = htons((u16)(ntohl(id) & ID_MASK) |
-			   ((tos_to_cos(iphdr->tos) << COS_OFFSET) & COS_MASK) |
-			   (enc_type << EN_OFFSET & EN_MASK) |
-			   ((truncate << T_OFFSET) & T_MASK));
-	ershdr->md.index = htonl(index & INDEX_MASK);
-}
-
 static netdev_tx_t erspan_xmit(struct sk_buff *skb,
 			       struct net_device *dev)
 {
@@ -737,7 +692,8 @@ static netdev_tx_t erspan_xmit(struct sk_buff *skb,
 	}
 
 	/* Push ERSPAN header */
-	erspan_build_header(skb, tunnel->parms.o_key, tunnel->index, truncate);
+	erspan_build_header(skb, tunnel->parms.o_key, tunnel->index,
+			    truncate, true);
 	tunnel->parms.o_flags &= ~TUNNEL_KEY;
 	__gre_xmit(skb, dev, &tunnel->parms.iph, htons(ETH_P_ERSPAN));
 	return NETDEV_TX_OK;
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 43b69af242e1..f0ed031f3594 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1106,7 +1106,7 @@ void ipv4_sk_update_pmtu(struct sk_buff *skb, struct sock *sk, u32 mtu)
 		new = true;
 	}
 
-	__ip_rt_update_pmtu((struct rtable *) rt->dst.path, &fl4, mtu);
+	__ip_rt_update_pmtu((struct rtable *) xfrm_dst_path(&rt->dst), &fl4, mtu);
 
 	if (!dst_check(&rt->dst, 0)) {
 		if (new)
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index f08eebe60446..c470fec9062f 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -3578,6 +3578,9 @@ void __init tcp_init(void)
 	percpu_counter_init(&tcp_sockets_allocated, 0, GFP_KERNEL);
 	percpu_counter_init(&tcp_orphan_count, 0, GFP_KERNEL);
 	inet_hashinfo_init(&tcp_hashinfo);
+	inet_hashinfo2_init(&tcp_hashinfo, "tcp_listen_portaddr_hash",
+			    thash_entries, 21,  /* one slot per 2 MB*/
+			    0, 64 * 1024);
 	tcp_hashinfo.bind_bucket_cachep =
 		kmem_cache_create("tcp_bind_bucket",
 				  sizeof(struct inet_bind_bucket), 0,
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index e4ff25c947c5..e9c0d1e1772e 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -357,18 +357,12 @@ fail:
 }
 EXPORT_SYMBOL(udp_lib_get_port);
 
-static u32 udp4_portaddr_hash(const struct net *net, __be32 saddr,
-			      unsigned int port)
-{
-	return jhash_1word((__force u32)saddr, net_hash_mix(net)) ^ port;
-}
-
 int udp_v4_get_port(struct sock *sk, unsigned short snum)
 {
 	unsigned int hash2_nulladdr =
-		udp4_portaddr_hash(sock_net(sk), htonl(INADDR_ANY), snum);
+		ipv4_portaddr_hash(sock_net(sk), htonl(INADDR_ANY), snum);
 	unsigned int hash2_partial =
-		udp4_portaddr_hash(sock_net(sk), inet_sk(sk)->inet_rcv_saddr, 0);
+		ipv4_portaddr_hash(sock_net(sk), inet_sk(sk)->inet_rcv_saddr, 0);
 
 	/* precompute partial secondary hash */
 	udp_sk(sk)->udp_portaddr_hash = hash2_partial;
@@ -445,7 +439,7 @@ static struct sock *udp4_lib_lookup2(struct net *net,
 				     struct sk_buff *skb)
 {
 	struct sock *sk, *result;
-	int score, badness, matches = 0, reuseport = 0;
+	int score, badness;
 	u32 hash = 0;
 
 	result = NULL;
@@ -454,23 +448,16 @@ static struct sock *udp4_lib_lookup2(struct net *net,
 		score = compute_score(sk, net, saddr, sport,
 				      daddr, hnum, dif, sdif, exact_dif);
 		if (score > badness) {
-			reuseport = sk->sk_reuseport;
-			if (reuseport) {
+			if (sk->sk_reuseport) {
 				hash = udp_ehashfn(net, daddr, hnum,
 						   saddr, sport);
 				result = reuseport_select_sock(sk, hash, skb,
 							sizeof(struct udphdr));
 				if (result)
 					return result;
-				matches = 1;
 			}
 			badness = score;
 			result = sk;
-		} else if (score == badness && reuseport) {
-			matches++;
-			if (reciprocal_scale(hash, matches) == 0)
-				result = sk;
-			hash = next_pseudo_random32(hash);
 		}
 	}
 	return result;
@@ -488,11 +475,11 @@ struct sock *__udp4_lib_lookup(struct net *net, __be32 saddr,
 	unsigned int hash2, slot2, slot = udp_hashfn(net, hnum, udptable->mask);
 	struct udp_hslot *hslot2, *hslot = &udptable->hash[slot];
 	bool exact_dif = udp_lib_exact_dif_match(net, skb);
-	int score, badness, matches = 0, reuseport = 0;
+	int score, badness;
 	u32 hash = 0;
 
 	if (hslot->count > 10) {
-		hash2 = udp4_portaddr_hash(net, daddr, hnum);
+		hash2 = ipv4_portaddr_hash(net, daddr, hnum);
 		slot2 = hash2 & udptable->mask;
 		hslot2 = &udptable->hash2[slot2];
 		if (hslot->count < hslot2->count)
@@ -503,7 +490,7 @@ struct sock *__udp4_lib_lookup(struct net *net, __be32 saddr,
 					  exact_dif, hslot2, skb);
 		if (!result) {
 			unsigned int old_slot2 = slot2;
-			hash2 = udp4_portaddr_hash(net, htonl(INADDR_ANY), hnum);
+			hash2 = ipv4_portaddr_hash(net, htonl(INADDR_ANY), hnum);
 			slot2 = hash2 & udptable->mask;
 			/* avoid searching the same slot again. */
 			if (unlikely(slot2 == old_slot2))
@@ -526,23 +513,16 @@ begin:
 		score = compute_score(sk, net, saddr, sport,
 				      daddr, hnum, dif, sdif, exact_dif);
 		if (score > badness) {
-			reuseport = sk->sk_reuseport;
-			if (reuseport) {
+			if (sk->sk_reuseport) {
 				hash = udp_ehashfn(net, daddr, hnum,
 						   saddr, sport);
 				result = reuseport_select_sock(sk, hash, skb,
 							sizeof(struct udphdr));
 				if (result)
 					return result;
-				matches = 1;
 			}
 			result = sk;
 			badness = score;
-		} else if (score == badness && reuseport) {
-			matches++;
-			if (reciprocal_scale(hash, matches) == 0)
-				result = sk;
-			hash = next_pseudo_random32(hash);
 		}
 	}
 	return result;
@@ -1775,7 +1755,7 @@ EXPORT_SYMBOL(udp_lib_rehash);
 
 static void udp_v4_rehash(struct sock *sk)
 {
-	u16 new_hash = udp4_portaddr_hash(sock_net(sk),
+	u16 new_hash = ipv4_portaddr_hash(sock_net(sk),
 					  inet_sk(sk)->inet_rcv_saddr,
 					  inet_sk(sk)->inet_num);
 	udp_lib_rehash(sk, new_hash);
@@ -1966,9 +1946,9 @@ static int __udp4_lib_mcast_deliver(struct net *net, struct sk_buff *skb,
 	struct sk_buff *nskb;
 
 	if (use_hash2) {
-		hash2_any = udp4_portaddr_hash(net, htonl(INADDR_ANY), hnum) &
+		hash2_any = ipv4_portaddr_hash(net, htonl(INADDR_ANY), hnum) &
 			    udptable->mask;
-		hash2 = udp4_portaddr_hash(net, daddr, hnum) & udptable->mask;
+		hash2 = ipv4_portaddr_hash(net, daddr, hnum) & udptable->mask;
 start_lookup:
 		hslot = &udptable->hash2[hash2];
 		offset = offsetof(typeof(*sk), __sk_common.skc_portaddr_node);
@@ -2200,7 +2180,7 @@ static struct sock *__udp4_lib_demux_lookup(struct net *net,
 					    int dif, int sdif)
 {
 	unsigned short hnum = ntohs(loc_port);
-	unsigned int hash2 = udp4_portaddr_hash(net, loc_addr, hnum);
+	unsigned int hash2 = ipv4_portaddr_hash(net, loc_addr, hnum);
 	unsigned int slot2 = hash2 & udp_table.mask;
 	struct udp_hslot *hslot2 = &udp_table.hash2[slot2];
 	INET_ADDR_COOKIE(acookie, rmt_addr, loc_addr);
diff --git a/net/ipv4/xfrm4_mode_tunnel.c b/net/ipv4/xfrm4_mode_tunnel.c
index e6265e2c274e..7d885a44dc9d 100644
--- a/net/ipv4/xfrm4_mode_tunnel.c
+++ b/net/ipv4/xfrm4_mode_tunnel.c
@@ -62,7 +62,7 @@ static int xfrm4_mode_tunnel_output(struct xfrm_state *x, struct sk_buff *skb)
 	top_iph->frag_off = (flags & XFRM_STATE_NOPMTUDISC) ?
 		0 : (XFRM_MODE_SKB_CB(skb)->frag_off & htons(IP_DF));
 
-	top_iph->ttl = ip4_dst_hoplimit(dst->child);
+	top_iph->ttl = ip4_dst_hoplimit(xfrm_dst_child(dst));
 
 	top_iph->saddr = x->props.saddr.a4;
 	top_iph->daddr = x->id.daddr.a4;
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index f49bd7897e95..ed06b1190f05 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -6595,27 +6595,45 @@ int __init addrconf_init(void)
 
 	rtnl_af_register(&inet6_ops);
 
-	err = __rtnl_register(PF_INET6, RTM_GETLINK, NULL, inet6_dump_ifinfo,
-			      0);
+	err = rtnl_register_module(THIS_MODULE, PF_INET6, RTM_GETLINK,
+				   NULL, inet6_dump_ifinfo, 0);
 	if (err < 0)
 		goto errout;
 
-	/* Only the first call to __rtnl_register can fail */
-	__rtnl_register(PF_INET6, RTM_NEWADDR, inet6_rtm_newaddr, NULL, 0);
-	__rtnl_register(PF_INET6, RTM_DELADDR, inet6_rtm_deladdr, NULL, 0);
-	__rtnl_register(PF_INET6, RTM_GETADDR, inet6_rtm_getaddr,
-			inet6_dump_ifaddr, RTNL_FLAG_DOIT_UNLOCKED);
-	__rtnl_register(PF_INET6, RTM_GETMULTICAST, NULL,
-			inet6_dump_ifmcaddr, 0);
-	__rtnl_register(PF_INET6, RTM_GETANYCAST, NULL,
-			inet6_dump_ifacaddr, 0);
-	__rtnl_register(PF_INET6, RTM_GETNETCONF, inet6_netconf_get_devconf,
-			inet6_netconf_dump_devconf, RTNL_FLAG_DOIT_UNLOCKED);
-
-	ipv6_addr_label_rtnl_register();
+	err = rtnl_register_module(THIS_MODULE, PF_INET6, RTM_NEWADDR,
+				   inet6_rtm_newaddr, NULL, 0);
+	if (err < 0)
+		goto errout;
+	err = rtnl_register_module(THIS_MODULE, PF_INET6, RTM_DELADDR,
+				   inet6_rtm_deladdr, NULL, 0);
+	if (err < 0)
+		goto errout;
+	err = rtnl_register_module(THIS_MODULE, PF_INET6, RTM_GETADDR,
+				   inet6_rtm_getaddr, inet6_dump_ifaddr,
+				   RTNL_FLAG_DOIT_UNLOCKED);
+	if (err < 0)
+		goto errout;
+	err = rtnl_register_module(THIS_MODULE, PF_INET6, RTM_GETMULTICAST,
+				   NULL, inet6_dump_ifmcaddr, 0);
+	if (err < 0)
+		goto errout;
+	err = rtnl_register_module(THIS_MODULE, PF_INET6, RTM_GETANYCAST,
+				   NULL, inet6_dump_ifacaddr, 0);
+	if (err < 0)
+		goto errout;
+	err = rtnl_register_module(THIS_MODULE, PF_INET6, RTM_GETNETCONF,
+				   inet6_netconf_get_devconf,
+				   inet6_netconf_dump_devconf,
+				   RTNL_FLAG_DOIT_UNLOCKED);
+	if (err < 0)
+		goto errout;
+	err = ipv6_addr_label_rtnl_register();
+	if (err < 0)
+		goto errout;
 
 	return 0;
 errout:
+	rtnl_unregister_all(PF_INET6);
 	rtnl_af_unregister(&inet6_ops);
 	unregister_netdevice_notifier(&ipv6_dev_notf);
 errlo:
diff --git a/net/ipv6/addrlabel.c b/net/ipv6/addrlabel.c
index 00e1f8ee08f8..1d6ced37ad71 100644
--- a/net/ipv6/addrlabel.c
+++ b/net/ipv6/addrlabel.c
@@ -547,13 +547,22 @@ static int ip6addrlbl_get(struct sk_buff *in_skb, struct nlmsghdr *nlh,
 	return err;
 }
 
-void __init ipv6_addr_label_rtnl_register(void)
+int __init ipv6_addr_label_rtnl_register(void)
 {
-	__rtnl_register(PF_INET6, RTM_NEWADDRLABEL, ip6addrlbl_newdel,
-			NULL, RTNL_FLAG_DOIT_UNLOCKED);
-	__rtnl_register(PF_INET6, RTM_DELADDRLABEL, ip6addrlbl_newdel,
-			NULL, RTNL_FLAG_DOIT_UNLOCKED);
-	__rtnl_register(PF_INET6, RTM_GETADDRLABEL, ip6addrlbl_get,
-			ip6addrlbl_dump, RTNL_FLAG_DOIT_UNLOCKED);
-}
+	int ret;
 
+	ret = rtnl_register_module(THIS_MODULE, PF_INET6, RTM_NEWADDRLABEL,
+				   ip6addrlbl_newdel,
+				   NULL, RTNL_FLAG_DOIT_UNLOCKED);
+	if (ret < 0)
+		return ret;
+	ret = rtnl_register_module(THIS_MODULE, PF_INET6, RTM_DELADDRLABEL,
+				   ip6addrlbl_newdel,
+				   NULL, RTNL_FLAG_DOIT_UNLOCKED);
+	if (ret < 0)
+		return ret;
+	ret = rtnl_register_module(THIS_MODULE, PF_INET6, RTM_GETADDRLABEL,
+				   ip6addrlbl_get,
+				   ip6addrlbl_dump, RTNL_FLAG_DOIT_UNLOCKED);
+	return ret;
+}
diff --git a/net/ipv6/inet6_hashtables.c b/net/ipv6/inet6_hashtables.c
index b01858f5deb1..2febe26de6a1 100644
--- a/net/ipv6/inet6_hashtables.c
+++ b/net/ipv6/inet6_hashtables.c
@@ -125,6 +125,40 @@ static inline int compute_score(struct sock *sk, struct net *net,
 }
 
 /* called with rcu_read_lock() */
+static struct sock *inet6_lhash2_lookup(struct net *net,
+		struct inet_listen_hashbucket *ilb2,
+		struct sk_buff *skb, int doff,
+		const struct in6_addr *saddr,
+		const __be16 sport, const struct in6_addr *daddr,
+		const unsigned short hnum, const int dif, const int sdif)
+{
+	bool exact_dif = inet6_exact_dif_match(net, skb);
+	struct inet_connection_sock *icsk;
+	struct sock *sk, *result = NULL;
+	int score, hiscore = 0;
+	u32 phash = 0;
+
+	inet_lhash2_for_each_icsk_rcu(icsk, &ilb2->head) {
+		sk = (struct sock *)icsk;
+		score = compute_score(sk, net, hnum, daddr, dif, sdif,
+				      exact_dif);
+		if (score > hiscore) {
+			if (sk->sk_reuseport) {
+				phash = inet6_ehashfn(net, daddr, hnum,
+						      saddr, sport);
+				result = reuseport_select_sock(sk, phash,
+							       skb, doff);
+				if (result)
+					return result;
+			}
+			result = sk;
+			hiscore = score;
+		}
+	}
+
+	return result;
+}
+
 struct sock *inet6_lookup_listener(struct net *net,
 		struct inet_hashinfo *hashinfo,
 		struct sk_buff *skb, int doff,
@@ -134,31 +168,56 @@ struct sock *inet6_lookup_listener(struct net *net,
 {
 	unsigned int hash = inet_lhashfn(net, hnum);
 	struct inet_listen_hashbucket *ilb = &hashinfo->listening_hash[hash];
-	int score, hiscore = 0, matches = 0, reuseport = 0;
 	bool exact_dif = inet6_exact_dif_match(net, skb);
+	struct inet_listen_hashbucket *ilb2;
 	struct sock *sk, *result = NULL;
+	int score, hiscore = 0;
+	unsigned int hash2;
 	u32 phash = 0;
 
+	if (ilb->count <= 10 || !hashinfo->lhash2)
+		goto port_lookup;
+
+	/* Too many sk in the ilb bucket (which is hashed by port alone).
+	 * Try lhash2 (which is hashed by port and addr) instead.
+	 */
+
+	hash2 = ipv6_portaddr_hash(net, daddr, hnum);
+	ilb2 = inet_lhash2_bucket(hashinfo, hash2);
+	if (ilb2->count > ilb->count)
+		goto port_lookup;
+
+	result = inet6_lhash2_lookup(net, ilb2, skb, doff,
+				     saddr, sport, daddr, hnum,
+				     dif, sdif);
+	if (result)
+		return result;
+
+	/* Lookup lhash2 with in6addr_any */
+
+	hash2 = ipv6_portaddr_hash(net, &in6addr_any, hnum);
+	ilb2 = inet_lhash2_bucket(hashinfo, hash2);
+	if (ilb2->count > ilb->count)
+		goto port_lookup;
+
+	return inet6_lhash2_lookup(net, ilb2, skb, doff,
+				   saddr, sport, daddr, hnum,
+				   dif, sdif);
+
+port_lookup:
 	sk_for_each(sk, &ilb->head) {
 		score = compute_score(sk, net, hnum, daddr, dif, sdif, exact_dif);
 		if (score > hiscore) {
-			reuseport = sk->sk_reuseport;
-			if (reuseport) {
+			if (sk->sk_reuseport) {
 				phash = inet6_ehashfn(net, daddr, hnum,
 						      saddr, sport);
 				result = reuseport_select_sock(sk, phash,
 							       skb, doff);
 				if (result)
 					return result;
-				matches = 1;
 			}
 			result = sk;
 			hiscore = score;
-		} else if (score == hiscore && reuseport) {
-			matches++;
-			if (reciprocal_scale(phash, matches) == 0)
-				result = sk;
-			phash = next_pseudo_random32(phash);
 		}
 	}
 	return result;
diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index f5285f4e1d08..a64d559fa513 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -893,7 +893,7 @@ static int fib6_add_rt2node(struct fib6_node *fn, struct rt6_info *rt,
 	ins = &fn->leaf;
 
 	for (iter = leaf; iter;
-	     iter = rcu_dereference_protected(iter->dst.rt6_next,
+	     iter = rcu_dereference_protected(iter->rt6_next,
 				lockdep_is_held(&rt->rt6i_table->tb6_lock))) {
 		/*
 		 *	Search for duplicates
@@ -950,7 +950,7 @@ static int fib6_add_rt2node(struct fib6_node *fn, struct rt6_info *rt,
 			break;
 
 next_iter:
-		ins = &iter->dst.rt6_next;
+		ins = &iter->rt6_next;
 	}
 
 	if (fallback_ins && !found) {
@@ -979,7 +979,7 @@ next_iter:
 					      &sibling->rt6i_siblings);
 				break;
 			}
-			sibling = rcu_dereference_protected(sibling->dst.rt6_next,
+			sibling = rcu_dereference_protected(sibling->rt6_next,
 				    lockdep_is_held(&rt->rt6i_table->tb6_lock));
 		}
 		/* For each sibling in the list, increment the counter of
@@ -1009,7 +1009,7 @@ add:
 		if (err)
 			return err;
 
-		rcu_assign_pointer(rt->dst.rt6_next, iter);
+		rcu_assign_pointer(rt->rt6_next, iter);
 		atomic_inc(&rt->rt6i_ref);
 		rcu_assign_pointer(rt->rt6i_node, fn);
 		rcu_assign_pointer(*ins, rt);
@@ -1040,7 +1040,7 @@ add:
 
 		atomic_inc(&rt->rt6i_ref);
 		rcu_assign_pointer(rt->rt6i_node, fn);
-		rt->dst.rt6_next = iter->dst.rt6_next;
+		rt->rt6_next = iter->rt6_next;
 		rcu_assign_pointer(*ins, rt);
 		call_fib6_entry_notifiers(info->nl_net, FIB_EVENT_ENTRY_REPLACE,
 					  rt, extack);
@@ -1059,14 +1059,14 @@ add:
 
 		if (nsiblings) {
 			/* Replacing an ECMP route, remove all siblings */
-			ins = &rt->dst.rt6_next;
+			ins = &rt->rt6_next;
 			iter = rcu_dereference_protected(*ins,
 				    lockdep_is_held(&rt->rt6i_table->tb6_lock));
 			while (iter) {
 				if (iter->rt6i_metric > rt->rt6i_metric)
 					break;
 				if (rt6_qualify_for_ecmp(iter)) {
-					*ins = iter->dst.rt6_next;
+					*ins = iter->rt6_next;
 					iter->rt6i_node = NULL;
 					fib6_purge_rt(iter, fn, info->nl_net);
 					if (rcu_access_pointer(fn->rr_ptr) == iter)
@@ -1075,7 +1075,7 @@ add:
 					nsiblings--;
 					info->nl_net->ipv6.rt6_stats->fib_rt_entries--;
 				} else {
-					ins = &iter->dst.rt6_next;
+					ins = &iter->rt6_next;
 				}
 				iter = rcu_dereference_protected(*ins,
 					lockdep_is_held(&rt->rt6i_table->tb6_lock));
@@ -1644,7 +1644,7 @@ static void fib6_del_route(struct fib6_table *table, struct fib6_node *fn,
 	WARN_ON_ONCE(rt->rt6i_flags & RTF_CACHE);
 
 	/* Unlink it */
-	*rtp = rt->dst.rt6_next;
+	*rtp = rt->rt6_next;
 	rt->rt6i_node = NULL;
 	net->ipv6.rt6_stats->fib_rt_entries--;
 	net->ipv6.rt6_stats->fib_discarded_routes++;
@@ -1672,7 +1672,7 @@ static void fib6_del_route(struct fib6_table *table, struct fib6_node *fn,
 	FOR_WALKERS(net, w) {
 		if (w->state == FWS_C && w->leaf == rt) {
 			RT6_TRACE("walker %p adjusted by delroute\n", w);
-			w->leaf = rcu_dereference_protected(rt->dst.rt6_next,
+			w->leaf = rcu_dereference_protected(rt->rt6_next,
 					    lockdep_is_held(&table->tb6_lock));
 			if (!w->leaf)
 				w->state = FWS_U;
@@ -1731,7 +1731,7 @@ int fib6_del(struct rt6_info *rt, struct nl_info *info)
 			fib6_del_route(table, fn, rtp, info);
 			return 0;
 		}
-		rtp_next = &cur->dst.rt6_next;
+		rtp_next = &cur->rt6_next;
 	}
 	return -ENOENT;
 }
@@ -2142,8 +2142,8 @@ int __init fib6_init(void)
 	if (ret)
 		goto out_kmem_cache_create;
 
-	ret = __rtnl_register(PF_INET6, RTM_GETROUTE, NULL, inet6_dump_fib,
-			      0);
+	ret = rtnl_register_module(THIS_MODULE, PF_INET6, RTM_GETROUTE, NULL,
+				   inet6_dump_fib, 0);
 	if (ret)
 		goto out_unregister_subsys;
 
@@ -2208,7 +2208,7 @@ static int ipv6_route_yield(struct fib6_walker *w)
 
 	do {
 		iter->w.leaf = rcu_dereference_protected(
-				iter->w.leaf->dst.rt6_next,
+				iter->w.leaf->rt6_next,
 				lockdep_is_held(&iter->tbl->tb6_lock));
 		iter->skip--;
 		if (!iter->skip && iter->w.leaf)
@@ -2274,7 +2274,7 @@ static void *ipv6_route_seq_next(struct seq_file *seq, void *v, loff_t *pos)
 	if (!v)
 		goto iter_table;
 
-	n = rcu_dereference_bh(((struct rt6_info *)v)->dst.rt6_next);
+	n = rcu_dereference_bh(((struct rt6_info *)v)->rt6_next);
 	if (n) {
 		++*pos;
 		return n;
diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c
index 4cfd8e0696fe..4562579797d1 100644
--- a/net/ipv6/ip6_gre.c
+++ b/net/ipv6/ip6_gre.c
@@ -55,6 +55,8 @@
 #include <net/ip6_route.h>
 #include <net/ip6_tunnel.h>
 #include <net/gre.h>
+#include <net/erspan.h>
+#include <net/dst_metadata.h>
 
 
 static bool log_ecn_error = true;
@@ -68,11 +70,13 @@ static unsigned int ip6gre_net_id __read_mostly;
 struct ip6gre_net {
 	struct ip6_tnl __rcu *tunnels[4][IP6_GRE_HASH_SIZE];
 
+	struct ip6_tnl __rcu *collect_md_tun;
 	struct net_device *fb_tunnel_dev;
 };
 
 static struct rtnl_link_ops ip6gre_link_ops __read_mostly;
 static struct rtnl_link_ops ip6gre_tap_ops __read_mostly;
+static struct rtnl_link_ops ip6erspan_tap_ops __read_mostly;
 static int ip6gre_tunnel_init(struct net_device *dev);
 static void ip6gre_tunnel_setup(struct net_device *dev);
 static void ip6gre_tunnel_link(struct ip6gre_net *ign, struct ip6_tnl *t);
@@ -121,7 +125,8 @@ static struct ip6_tnl *ip6gre_tunnel_lookup(struct net_device *dev,
 	unsigned int h1 = HASH_KEY(key);
 	struct ip6_tnl *t, *cand = NULL;
 	struct ip6gre_net *ign = net_generic(net, ip6gre_net_id);
-	int dev_type = (gre_proto == htons(ETH_P_TEB)) ?
+	int dev_type = (gre_proto == htons(ETH_P_TEB) ||
+			gre_proto == htons(ETH_P_ERSPAN)) ?
 		       ARPHRD_ETHER : ARPHRD_IP6GRE;
 	int score, cand_score = 4;
 
@@ -226,6 +231,10 @@ static struct ip6_tnl *ip6gre_tunnel_lookup(struct net_device *dev,
 	if (cand)
 		return cand;
 
+	t = rcu_dereference(ign->collect_md_tun);
+	if (t && t->dev->flags & IFF_UP)
+		return t;
+
 	dev = ign->fb_tunnel_dev;
 	if (dev->flags & IFF_UP)
 		return netdev_priv(dev);
@@ -261,6 +270,9 @@ static void ip6gre_tunnel_link(struct ip6gre_net *ign, struct ip6_tnl *t)
 {
 	struct ip6_tnl __rcu **tp = ip6gre_bucket(ign, t);
 
+	if (t->parms.collect_md)
+		rcu_assign_pointer(ign->collect_md_tun, t);
+
 	rcu_assign_pointer(t->next, rtnl_dereference(*tp));
 	rcu_assign_pointer(*tp, t);
 }
@@ -270,6 +282,9 @@ static void ip6gre_tunnel_unlink(struct ip6gre_net *ign, struct ip6_tnl *t)
 	struct ip6_tnl __rcu **tp;
 	struct ip6_tnl *iter;
 
+	if (t->parms.collect_md)
+		rcu_assign_pointer(ign->collect_md_tun, NULL);
+
 	for (tp = ip6gre_bucket(ign, t);
 	     (iter = rtnl_dereference(*tp)) != NULL;
 	     tp = &iter->next) {
@@ -460,7 +475,86 @@ static int ip6gre_rcv(struct sk_buff *skb, const struct tnl_ptk_info *tpi)
 				      &ipv6h->saddr, &ipv6h->daddr, tpi->key,
 				      tpi->proto);
 	if (tunnel) {
-		ip6_tnl_rcv(tunnel, skb, tpi, NULL, log_ecn_error);
+		if (tunnel->parms.collect_md) {
+			struct metadata_dst *tun_dst;
+			__be64 tun_id;
+			__be16 flags;
+
+			flags = tpi->flags;
+			tun_id = key32_to_tunnel_id(tpi->key);
+
+			tun_dst = ipv6_tun_rx_dst(skb, flags, tun_id, 0);
+			if (!tun_dst)
+				return PACKET_REJECT;
+
+			ip6_tnl_rcv(tunnel, skb, tpi, tun_dst, log_ecn_error);
+		} else {
+			ip6_tnl_rcv(tunnel, skb, tpi, NULL, log_ecn_error);
+		}
+
+		return PACKET_RCVD;
+	}
+
+	return PACKET_REJECT;
+}
+
+static int ip6erspan_rcv(struct sk_buff *skb, int gre_hdr_len,
+			 struct tnl_ptk_info *tpi)
+{
+	const struct ipv6hdr *ipv6h;
+	struct erspanhdr *ershdr;
+	struct ip6_tnl *tunnel;
+	__be32 index;
+
+	ipv6h = ipv6_hdr(skb);
+	ershdr = (struct erspanhdr *)skb->data;
+
+	if (unlikely(!pskb_may_pull(skb, sizeof(*ershdr))))
+		return PACKET_REJECT;
+
+	tpi->key = cpu_to_be32(ntohs(ershdr->session_id) & ID_MASK);
+	index = ershdr->md.index;
+
+	tunnel = ip6gre_tunnel_lookup(skb->dev,
+				      &ipv6h->saddr, &ipv6h->daddr, tpi->key,
+				      tpi->proto);
+	if (tunnel) {
+		if (__iptunnel_pull_header(skb, sizeof(*ershdr),
+					   htons(ETH_P_TEB),
+					   false, false) < 0)
+			return PACKET_REJECT;
+
+		if (tunnel->parms.collect_md) {
+			struct metadata_dst *tun_dst;
+			struct ip_tunnel_info *info;
+			struct erspan_metadata *md;
+			__be64 tun_id;
+			__be16 flags;
+
+			tpi->flags |= TUNNEL_KEY;
+			flags = tpi->flags;
+			tun_id = key32_to_tunnel_id(tpi->key);
+
+			tun_dst = ipv6_tun_rx_dst(skb, flags, tun_id,
+						  sizeof(*md));
+			if (!tun_dst)
+				return PACKET_REJECT;
+
+			info = &tun_dst->u.tun_info;
+			md = ip_tunnel_info_opts(info);
+			if (!md)
+				return PACKET_REJECT;
+
+			md->index = index;
+			info->key.tun_flags |= TUNNEL_ERSPAN_OPT;
+			info->options_len = sizeof(*md);
+
+			ip6_tnl_rcv(tunnel, skb, tpi, tun_dst, log_ecn_error);
+
+		} else {
+			tunnel->parms.index = ntohl(index);
+			ip6_tnl_rcv(tunnel, skb, tpi, NULL, log_ecn_error);
+		}
 
 		return PACKET_RCVD;
 	}
@@ -481,6 +575,12 @@ static int gre_rcv(struct sk_buff *skb)
 	if (iptunnel_pull_header(skb, hdr_len, tpi.proto, false))
 		goto drop;
 
+	if (unlikely(tpi.proto == htons(ETH_P_ERSPAN))) {
+		if (ip6erspan_rcv(skb, hdr_len, &tpi) == PACKET_RCVD)
+			return 0;
+		goto drop;
+	}
+
 	if (ip6gre_rcv(skb, &tpi) == PACKET_RCVD)
 		return 0;
 
@@ -496,6 +596,78 @@ static int gre_handle_offloads(struct sk_buff *skb, bool csum)
 					csum ? SKB_GSO_GRE_CSUM : SKB_GSO_GRE);
 }
 
+static void prepare_ip6gre_xmit_ipv4(struct sk_buff *skb,
+				     struct net_device *dev,
+				     struct flowi6 *fl6, __u8 *dsfield,
+				     int *encap_limit)
+{
+	const struct iphdr *iph = ip_hdr(skb);
+	struct ip6_tnl *t = netdev_priv(dev);
+
+	if (!(t->parms.flags & IP6_TNL_F_IGN_ENCAP_LIMIT))
+		*encap_limit = t->parms.encap_limit;
+
+	memcpy(fl6, &t->fl.u.ip6, sizeof(*fl6));
+
+	if (t->parms.flags & IP6_TNL_F_USE_ORIG_TCLASS)
+		*dsfield = ipv4_get_dsfield(iph);
+	else
+		*dsfield = ip6_tclass(t->parms.flowinfo);
+
+	if (t->parms.flags & IP6_TNL_F_USE_ORIG_FWMARK)
+		fl6->flowi6_mark = skb->mark;
+	else
+		fl6->flowi6_mark = t->parms.fwmark;
+
+	fl6->flowi6_uid = sock_net_uid(dev_net(dev), NULL);
+}
+
+static int prepare_ip6gre_xmit_ipv6(struct sk_buff *skb,
+				    struct net_device *dev,
+				    struct flowi6 *fl6, __u8 *dsfield,
+				    int *encap_limit)
+{
+	struct ipv6hdr *ipv6h = ipv6_hdr(skb);
+	struct ip6_tnl *t = netdev_priv(dev);
+	__u16 offset;
+
+	offset = ip6_tnl_parse_tlv_enc_lim(skb, skb_network_header(skb));
+	/* ip6_tnl_parse_tlv_enc_lim() might have reallocated skb->head */
+
+	if (offset > 0) {
+		struct ipv6_tlv_tnl_enc_lim *tel;
+
+		tel = (struct ipv6_tlv_tnl_enc_lim *)&skb_network_header(skb)[offset];
+		if (tel->encap_limit == 0) {
+			icmpv6_send(skb, ICMPV6_PARAMPROB,
+				    ICMPV6_HDR_FIELD, offset + 2);
+			return -1;
+		}
+		*encap_limit = tel->encap_limit - 1;
+	} else if (!(t->parms.flags & IP6_TNL_F_IGN_ENCAP_LIMIT)) {
+		*encap_limit = t->parms.encap_limit;
+	}
+
+	memcpy(fl6, &t->fl.u.ip6, sizeof(*fl6));
+
+	if (t->parms.flags & IP6_TNL_F_USE_ORIG_TCLASS)
+		*dsfield = ipv6_get_dsfield(ipv6h);
+	else
+		*dsfield = ip6_tclass(t->parms.flowinfo);
+
+	if (t->parms.flags & IP6_TNL_F_USE_ORIG_FLOWLABEL)
+		fl6->flowlabel |= ip6_flowlabel(ipv6h);
+
+	if (t->parms.flags & IP6_TNL_F_USE_ORIG_FWMARK)
+		fl6->flowi6_mark = skb->mark;
+	else
+		fl6->flowi6_mark = t->parms.fwmark;
+
+	fl6->flowi6_uid = sock_net_uid(dev_net(dev), NULL);
+
+	return 0;
+}
+
 static netdev_tx_t __gre6_xmit(struct sk_buff *skb,
 			       struct net_device *dev, __u8 dsfield,
 			       struct flowi6 *fl6, int encap_limit,
@@ -517,8 +689,38 @@ static netdev_tx_t __gre6_xmit(struct sk_buff *skb,
 
 	/* Push GRE header. */
 	protocol = (dev->type == ARPHRD_ETHER) ? htons(ETH_P_TEB) : proto;
-	gre_build_header(skb, tunnel->tun_hlen, tunnel->parms.o_flags,
-			 protocol, tunnel->parms.o_key, htonl(tunnel->o_seqno));
+
+	if (tunnel->parms.collect_md) {
+		struct ip_tunnel_info *tun_info;
+		const struct ip_tunnel_key *key;
+		__be16 flags;
+
+		tun_info = skb_tunnel_info(skb);
+		if (unlikely(!tun_info ||
+			     !(tun_info->mode & IP_TUNNEL_INFO_TX) ||
+			     ip_tunnel_info_af(tun_info) != AF_INET6))
+			return -EINVAL;
+
+		key = &tun_info->key;
+		memset(fl6, 0, sizeof(*fl6));
+		fl6->flowi6_proto = IPPROTO_GRE;
+		fl6->daddr = key->u.ipv6.dst;
+		fl6->flowlabel = key->label;
+		fl6->flowi6_uid = sock_net_uid(dev_net(dev), NULL);
+
+		dsfield = key->tos;
+		flags = key->tun_flags & (TUNNEL_CSUM | TUNNEL_KEY);
+		tunnel->tun_hlen = gre_calc_hlen(flags);
+
+		gre_build_header(skb, tunnel->tun_hlen,
+				 flags, protocol,
+				 tunnel_id_to_key32(tun_info->key.tun_id), 0);
+
+	} else {
+		gre_build_header(skb, tunnel->tun_hlen, tunnel->parms.o_flags,
+				 protocol, tunnel->parms.o_key,
+				 htonl(tunnel->o_seqno));
+	}
 
 	return ip6_tnl_xmit(skb, dev, dsfield, fl6, encap_limit, pmtu,
 			    NEXTHDR_GRE);
@@ -527,30 +729,17 @@ static netdev_tx_t __gre6_xmit(struct sk_buff *skb,
 static inline int ip6gre_xmit_ipv4(struct sk_buff *skb, struct net_device *dev)
 {
 	struct ip6_tnl *t = netdev_priv(dev);
-	const struct iphdr  *iph = ip_hdr(skb);
 	int encap_limit = -1;
 	struct flowi6 fl6;
-	__u8 dsfield;
+	__u8 dsfield = 0;
 	__u32 mtu;
 	int err;
 
 	memset(&(IPCB(skb)->opt), 0, sizeof(IPCB(skb)->opt));
 
-	if (!(t->parms.flags & IP6_TNL_F_IGN_ENCAP_LIMIT))
-		encap_limit = t->parms.encap_limit;
-
-	memcpy(&fl6, &t->fl.u.ip6, sizeof(fl6));
-
-	if (t->parms.flags & IP6_TNL_F_USE_ORIG_TCLASS)
-		dsfield = ipv4_get_dsfield(iph);
-	else
-		dsfield = ip6_tclass(t->parms.flowinfo);
-	if (t->parms.flags & IP6_TNL_F_USE_ORIG_FWMARK)
-		fl6.flowi6_mark = skb->mark;
-	else
-		fl6.flowi6_mark = t->parms.fwmark;
-
-	fl6.flowi6_uid = sock_net_uid(dev_net(dev), NULL);
+	if (!t->parms.collect_md)
+		prepare_ip6gre_xmit_ipv4(skb, dev, &fl6,
+					 &dsfield, &encap_limit);
 
 	err = gre_handle_offloads(skb, !!(t->parms.o_flags & TUNNEL_CSUM));
 	if (err)
@@ -574,46 +763,17 @@ static inline int ip6gre_xmit_ipv6(struct sk_buff *skb, struct net_device *dev)
 	struct ip6_tnl *t = netdev_priv(dev);
 	struct ipv6hdr *ipv6h = ipv6_hdr(skb);
 	int encap_limit = -1;
-	__u16 offset;
 	struct flowi6 fl6;
-	__u8 dsfield;
+	__u8 dsfield = 0;
 	__u32 mtu;
 	int err;
 
 	if (ipv6_addr_equal(&t->parms.raddr, &ipv6h->saddr))
 		return -1;
 
-	offset = ip6_tnl_parse_tlv_enc_lim(skb, skb_network_header(skb));
-	/* ip6_tnl_parse_tlv_enc_lim() might have reallocated skb->head */
-	ipv6h = ipv6_hdr(skb);
-
-	if (offset > 0) {
-		struct ipv6_tlv_tnl_enc_lim *tel;
-		tel = (struct ipv6_tlv_tnl_enc_lim *)&skb_network_header(skb)[offset];
-		if (tel->encap_limit == 0) {
-			icmpv6_send(skb, ICMPV6_PARAMPROB,
-				    ICMPV6_HDR_FIELD, offset + 2);
-			return -1;
-		}
-		encap_limit = tel->encap_limit - 1;
-	} else if (!(t->parms.flags & IP6_TNL_F_IGN_ENCAP_LIMIT))
-		encap_limit = t->parms.encap_limit;
-
-	memcpy(&fl6, &t->fl.u.ip6, sizeof(fl6));
-
-	if (t->parms.flags & IP6_TNL_F_USE_ORIG_TCLASS)
-		dsfield = ipv6_get_dsfield(ipv6h);
-	else
-		dsfield = ip6_tclass(t->parms.flowinfo);
-
-	if (t->parms.flags & IP6_TNL_F_USE_ORIG_FLOWLABEL)
-		fl6.flowlabel |= ip6_flowlabel(ipv6h);
-	if (t->parms.flags & IP6_TNL_F_USE_ORIG_FWMARK)
-		fl6.flowi6_mark = skb->mark;
-	else
-		fl6.flowi6_mark = t->parms.fwmark;
-
-	fl6.flowi6_uid = sock_net_uid(dev_net(dev), NULL);
+	if (!t->parms.collect_md &&
+	    prepare_ip6gre_xmit_ipv6(skb, dev, &fl6, &dsfield, &encap_limit))
+		return -1;
 
 	if (gre_handle_offloads(skb, !!(t->parms.o_flags & TUNNEL_CSUM)))
 		return -1;
@@ -660,7 +820,8 @@ static int ip6gre_xmit_other(struct sk_buff *skb, struct net_device *dev)
 	if (!(t->parms.flags & IP6_TNL_F_IGN_ENCAP_LIMIT))
 		encap_limit = t->parms.encap_limit;
 
-	memcpy(&fl6, &t->fl.u.ip6, sizeof(fl6));
+	if (!t->parms.collect_md)
+		memcpy(&fl6, &t->fl.u.ip6, sizeof(fl6));
 
 	err = gre_handle_offloads(skb, !!(t->parms.o_flags & TUNNEL_CSUM));
 	if (err)
@@ -705,6 +866,119 @@ tx_err:
 	return NETDEV_TX_OK;
 }
 
+static netdev_tx_t ip6erspan_tunnel_xmit(struct sk_buff *skb,
+					 struct net_device *dev)
+{
+	struct ipv6hdr *ipv6h = ipv6_hdr(skb);
+	struct ip6_tnl *t = netdev_priv(dev);
+	struct dst_entry *dst = skb_dst(skb);
+	struct net_device_stats *stats;
+	bool truncate = false;
+	int encap_limit = -1;
+	__u8 dsfield = false;
+	struct flowi6 fl6;
+	int err = -EINVAL;
+	__u32 mtu;
+
+	if (!ip6_tnl_xmit_ctl(t, &t->parms.laddr, &t->parms.raddr))
+		goto tx_err;
+
+	if (gre_handle_offloads(skb, false))
+		goto tx_err;
+
+	if (skb->len > dev->mtu + dev->hard_header_len) {
+		pskb_trim(skb, dev->mtu + dev->hard_header_len);
+		truncate = true;
+	}
+
+	t->parms.o_flags &= ~TUNNEL_KEY;
+	IPCB(skb)->flags = 0;
+
+	/* For collect_md mode, derive fl6 from the tunnel key,
+	 * for native mode, call prepare_ip6gre_xmit_{ipv4,ipv6}.
+	 */
+	if (t->parms.collect_md) {
+		struct ip_tunnel_info *tun_info;
+		const struct ip_tunnel_key *key;
+		struct erspan_metadata *md;
+
+		tun_info = skb_tunnel_info(skb);
+		if (unlikely(!tun_info ||
+			     !(tun_info->mode & IP_TUNNEL_INFO_TX) ||
+			     ip_tunnel_info_af(tun_info) != AF_INET6))
+			return -EINVAL;
+
+		key = &tun_info->key;
+		memset(&fl6, 0, sizeof(fl6));
+		fl6.flowi6_proto = IPPROTO_GRE;
+		fl6.daddr = key->u.ipv6.dst;
+		fl6.flowlabel = key->label;
+		fl6.flowi6_uid = sock_net_uid(dev_net(dev), NULL);
+
+		dsfield = key->tos;
+		md = ip_tunnel_info_opts(tun_info);
+		if (!md)
+			goto tx_err;
+
+		erspan_build_header(skb, tunnel_id_to_key32(key->tun_id),
+				    ntohl(md->index), truncate, false);
+
+	} else {
+		switch (skb->protocol) {
+		case htons(ETH_P_IP):
+			memset(&(IPCB(skb)->opt), 0, sizeof(IPCB(skb)->opt));
+			prepare_ip6gre_xmit_ipv4(skb, dev, &fl6,
+						 &dsfield, &encap_limit);
+			break;
+		case htons(ETH_P_IPV6):
+			if (ipv6_addr_equal(&t->parms.raddr, &ipv6h->saddr))
+				goto tx_err;
+			if (prepare_ip6gre_xmit_ipv6(skb, dev, &fl6,
+						     &dsfield, &encap_limit))
+				goto tx_err;
+			break;
+		default:
+			memcpy(&fl6, &t->fl.u.ip6, sizeof(fl6));
+			break;
+		}
+
+		erspan_build_header(skb, t->parms.o_key, t->parms.index,
+				    truncate, false);
+		fl6.daddr = t->parms.raddr;
+	}
+
+	/* Push GRE header. */
+	gre_build_header(skb, 8, TUNNEL_SEQ,
+			 htons(ETH_P_ERSPAN), 0, htonl(t->o_seqno++));
+
+	/* TooBig packet may have updated dst->dev's mtu */
+	if (!t->parms.collect_md && dst && dst_mtu(dst) > dst->dev->mtu)
+		dst->ops->update_pmtu(dst, NULL, skb, dst->dev->mtu);
+
+	err = ip6_tnl_xmit(skb, dev, dsfield, &fl6, encap_limit, &mtu,
+			   NEXTHDR_GRE);
+	if (err != 0) {
+		/* XXX: send ICMP error even if DF is not set. */
+		if (err == -EMSGSIZE) {
+			if (skb->protocol == htons(ETH_P_IP))
+				icmp_send(skb, ICMP_DEST_UNREACH,
+					  ICMP_FRAG_NEEDED, htonl(mtu));
+			else
+				icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
+		}
+
+		goto tx_err;
+	}
+	return NETDEV_TX_OK;
+
+tx_err:
+	stats = &t->dev->stats;
+	stats->tx_errors++;
+	stats->tx_dropped++;
+	kfree_skb(skb);
+	return NETDEV_TX_OK;
+}
+
 static void ip6gre_tnl_link_config(struct ip6_tnl *t, int set_mtu)
 {
 	struct net_device *dev = t->dev;
@@ -1048,6 +1322,11 @@ static int ip6gre_tunnel_init_common(struct net_device *dev)
 	if (!(tunnel->parms.flags & IP6_TNL_F_IGN_ENCAP_LIMIT))
 		dev->mtu -= 8;
 
+	if (tunnel->parms.collect_md) {
+		dev->features |= NETIF_F_NETNS_LOCAL;
+		netif_keep_dst(dev);
+	}
+
 	return 0;
 }
 
@@ -1062,6 +1341,9 @@ static int ip6gre_tunnel_init(struct net_device *dev)
 
 	tunnel = netdev_priv(dev);
 
+	if (tunnel->parms.collect_md)
+		return 0;
+
 	memcpy(dev->dev_addr, &tunnel->parms.laddr, sizeof(struct in6_addr));
 	memcpy(dev->broadcast, &tunnel->parms.raddr, sizeof(struct in6_addr));
 
@@ -1084,7 +1366,6 @@ static void ip6gre_fb_tunnel_init(struct net_device *dev)
 	dev_hold(dev);
 }
 
-
 static struct inet6_protocol ip6gre_protocol __read_mostly = {
 	.handler     = gre_rcv,
 	.err_handler = ip6gre_err,
@@ -1099,7 +1380,8 @@ static void ip6gre_destroy_tunnels(struct net *net, struct list_head *head)
 
 	for_each_netdev_safe(net, dev, aux)
 		if (dev->rtnl_link_ops == &ip6gre_link_ops ||
-		    dev->rtnl_link_ops == &ip6gre_tap_ops)
+		    dev->rtnl_link_ops == &ip6gre_tap_ops ||
+		    dev->rtnl_link_ops == &ip6erspan_tap_ops)
 			unregister_netdevice_queue(dev, head);
 
 	for (prio = 0; prio < 4; prio++) {
@@ -1221,6 +1503,47 @@ out:
 	return ip6gre_tunnel_validate(tb, data, extack);
 }
 
+static int ip6erspan_tap_validate(struct nlattr *tb[], struct nlattr *data[],
+				  struct netlink_ext_ack *extack)
+{
+	__be16 flags = 0;
+	int ret;
+
+	if (!data)
+		return 0;
+
+	ret = ip6gre_tap_validate(tb, data, extack);
+	if (ret)
+		return ret;
+
+	/* ERSPAN should only have GRE sequence and key flag */
+	if (data[IFLA_GRE_OFLAGS])
+		flags |= nla_get_be16(data[IFLA_GRE_OFLAGS]);
+	if (data[IFLA_GRE_IFLAGS])
+		flags |= nla_get_be16(data[IFLA_GRE_IFLAGS]);
+	if (!data[IFLA_GRE_COLLECT_METADATA] &&
+	    flags != (GRE_SEQ | GRE_KEY))
+		return -EINVAL;
+
+	/* ERSPAN Session ID only has 10-bit. Since we reuse
+	 * 32-bit key field as ID, check it's range.
+	 */
+	if (data[IFLA_GRE_IKEY] &&
+	    (ntohl(nla_get_be32(data[IFLA_GRE_IKEY])) & ~ID_MASK))
+		return -EINVAL;
+
+	if (data[IFLA_GRE_OKEY] &&
+	    (ntohl(nla_get_be32(data[IFLA_GRE_OKEY])) & ~ID_MASK))
+		return -EINVAL;
+
+	if (data[IFLA_GRE_ERSPAN_INDEX]) {
+		u32 index = nla_get_u32(data[IFLA_GRE_ERSPAN_INDEX]);
+
+		if (index & ~INDEX_MASK)
+			return -EINVAL;
+	}
+	return 0;
+}
 
 static void ip6gre_netlink_parms(struct nlattr *data[],
 				struct __ip6_tnl_parm *parms)
@@ -1267,6 +1590,12 @@ static void ip6gre_netlink_parms(struct nlattr *data[],
 
 	if (data[IFLA_GRE_FWMARK])
 		parms->fwmark = nla_get_u32(data[IFLA_GRE_FWMARK]);
+
+	if (data[IFLA_GRE_ERSPAN_INDEX])
+		parms->index = nla_get_u32(data[IFLA_GRE_ERSPAN_INDEX]);
+
+	if (data[IFLA_GRE_COLLECT_METADATA])
+		parms->collect_md = true;
 }
 
 static int ip6gre_tap_init(struct net_device *dev)
@@ -1303,6 +1632,59 @@ static const struct net_device_ops ip6gre_tap_netdev_ops = {
 		       NETIF_F_HIGHDMA |		\
 		       NETIF_F_HW_CSUM)
 
+static int ip6erspan_tap_init(struct net_device *dev)
+{
+	struct ip6_tnl *tunnel;
+	int t_hlen;
+	int ret;
+
+	tunnel = netdev_priv(dev);
+
+	tunnel->dev = dev;
+	tunnel->net = dev_net(dev);
+	strcpy(tunnel->parms.name, dev->name);
+
+	dev->tstats = netdev_alloc_pcpu_stats(struct pcpu_sw_netstats);
+	if (!dev->tstats)
+		return -ENOMEM;
+
+	ret = dst_cache_init(&tunnel->dst_cache, GFP_KERNEL);
+	if (ret) {
+		free_percpu(dev->tstats);
+		dev->tstats = NULL;
+		return ret;
+	}
+
+	tunnel->tun_hlen = 8;
+	tunnel->hlen = tunnel->tun_hlen + tunnel->encap_hlen +
+		       sizeof(struct erspanhdr);
+	t_hlen = tunnel->hlen + sizeof(struct ipv6hdr);
+
+	dev->hard_header_len = LL_MAX_HEADER + t_hlen;
+	dev->mtu = ETH_DATA_LEN - t_hlen;
+	if (dev->type == ARPHRD_ETHER)
+		dev->mtu -= ETH_HLEN;
+	if (!(tunnel->parms.flags & IP6_TNL_F_IGN_ENCAP_LIMIT))
+		dev->mtu -= 8;
+
+	dev->priv_flags |= IFF_LIVE_ADDR_CHANGE;
+	tunnel = netdev_priv(dev);
+	ip6gre_tnl_link_config(tunnel, 1);
+
+	return 0;
+}
+
+static const struct net_device_ops ip6erspan_netdev_ops = {
+	.ndo_init =		ip6erspan_tap_init,
+	.ndo_uninit =		ip6gre_tunnel_uninit,
+	.ndo_start_xmit =	ip6erspan_tunnel_xmit,
+	.ndo_set_mac_address =	eth_mac_addr,
+	.ndo_validate_addr =	eth_validate_addr,
+	.ndo_change_mtu =	ip6_tnl_change_mtu,
+	.ndo_get_stats64 =	ip_tunnel_get_stats64,
+	.ndo_get_iflink =	ip6_tnl_get_iflink,
+};
+
 static void ip6gre_tap_setup(struct net_device *dev)
 {
 
@@ -1372,8 +1754,13 @@ static int ip6gre_newlink(struct net *src_net, struct net_device *dev,
 
 	ip6gre_netlink_parms(data, &nt->parms);
 
-	if (ip6gre_tunnel_find(net, &nt->parms, dev->type))
-		return -EEXIST;
+	if (nt->parms.collect_md) {
+		if (rtnl_dereference(ign->collect_md_tun))
+			return -EEXIST;
+	} else {
+		if (ip6gre_tunnel_find(net, &nt->parms, dev->type))
+			return -EEXIST;
+	}
 
 	if (dev->type == ARPHRD_ETHER && !tb[IFLA_ADDRESS])
 		eth_hw_addr_random(dev);
@@ -1492,8 +1879,12 @@ static size_t ip6gre_get_size(const struct net_device *dev)
 		nla_total_size(2) +
 		/* IFLA_GRE_ENCAP_DPORT */
 		nla_total_size(2) +
+		/* IFLA_GRE_COLLECT_METADATA */
+		nla_total_size(0) +
 		/* IFLA_GRE_FWMARK */
 		nla_total_size(4) +
+		/* IFLA_GRE_ERSPAN_INDEX */
+		nla_total_size(4) +
 		0;
 }
 
@@ -1515,7 +1906,8 @@ static int ip6gre_fill_info(struct sk_buff *skb, const struct net_device *dev)
 	    nla_put_u8(skb, IFLA_GRE_ENCAP_LIMIT, p->encap_limit) ||
 	    nla_put_be32(skb, IFLA_GRE_FLOWINFO, p->flowinfo) ||
 	    nla_put_u32(skb, IFLA_GRE_FLAGS, p->flags) ||
-	    nla_put_u32(skb, IFLA_GRE_FWMARK, p->fwmark))
+	    nla_put_u32(skb, IFLA_GRE_FWMARK, p->fwmark) ||
+	    nla_put_u32(skb, IFLA_GRE_ERSPAN_INDEX, p->index))
 		goto nla_put_failure;
 
 	if (nla_put_u16(skb, IFLA_GRE_ENCAP_TYPE,
@@ -1528,6 +1920,11 @@ static int ip6gre_fill_info(struct sk_buff *skb, const struct net_device *dev)
 			t->encap.flags))
 		goto nla_put_failure;
 
+	if (p->collect_md) {
+		if (nla_put_flag(skb, IFLA_GRE_COLLECT_METADATA))
+			goto nla_put_failure;
+	}
+
 	return 0;
 
 nla_put_failure:
@@ -1550,9 +1947,25 @@ static const struct nla_policy ip6gre_policy[IFLA_GRE_MAX + 1] = {
 	[IFLA_GRE_ENCAP_FLAGS]  = { .type = NLA_U16 },
 	[IFLA_GRE_ENCAP_SPORT]  = { .type = NLA_U16 },
 	[IFLA_GRE_ENCAP_DPORT]  = { .type = NLA_U16 },
+	[IFLA_GRE_COLLECT_METADATA] = { .type = NLA_FLAG },
 	[IFLA_GRE_FWMARK]       = { .type = NLA_U32 },
+	[IFLA_GRE_ERSPAN_INDEX] = { .type = NLA_U32 },
 };
 
+static void ip6erspan_tap_setup(struct net_device *dev)
+{
+	ether_setup(dev);
+
+	dev->netdev_ops = &ip6erspan_netdev_ops;
+	dev->needs_free_netdev = true;
+	dev->priv_destructor = ip6gre_dev_free;
+
+	dev->features |= NETIF_F_NETNS_LOCAL;
+	dev->priv_flags &= ~IFF_TX_SKB_SHARING;
+	dev->priv_flags |= IFF_LIVE_ADDR_CHANGE;
+	netif_keep_dst(dev);
+}
+
 static struct rtnl_link_ops ip6gre_link_ops __read_mostly = {
 	.kind		= "ip6gre",
 	.maxtype	= IFLA_GRE_MAX,
@@ -1582,6 +1995,20 @@ static struct rtnl_link_ops ip6gre_tap_ops __read_mostly = {
 	.get_link_net	= ip6_tnl_get_link_net,
 };
 
+static struct rtnl_link_ops ip6erspan_tap_ops __read_mostly = {
+	.kind		= "ip6erspan",
+	.maxtype	= IFLA_GRE_MAX,
+	.policy		= ip6gre_policy,
+	.priv_size	= sizeof(struct ip6_tnl),
+	.setup		= ip6erspan_tap_setup,
+	.validate	= ip6erspan_tap_validate,
+	.newlink	= ip6gre_newlink,
+	.changelink	= ip6gre_changelink,
+	.get_size	= ip6gre_get_size,
+	.fill_info	= ip6gre_fill_info,
+	.get_link_net	= ip6_tnl_get_link_net,
+};
+
 /*
  *	And now the modules code and kernel interface.
  */
@@ -1610,9 +2037,15 @@ static int __init ip6gre_init(void)
 	if (err < 0)
 		goto tap_ops_failed;
 
+	err = rtnl_link_register(&ip6erspan_tap_ops);
+	if (err < 0)
+		goto erspan_link_failed;
+
 out:
 	return err;
 
+erspan_link_failed:
+	rtnl_link_unregister(&ip6gre_tap_ops);
 tap_ops_failed:
 	rtnl_link_unregister(&ip6gre_link_ops);
 rtnl_link_failed:
@@ -1626,6 +2059,7 @@ static void __exit ip6gre_fini(void)
 {
 	rtnl_link_unregister(&ip6gre_tap_ops);
 	rtnl_link_unregister(&ip6gre_link_ops);
+	rtnl_link_unregister(&ip6erspan_tap_ops);
 	inet6_del_protocol(&ip6gre_protocol, IPPROTO_GRE);
 	unregister_pernet_device(&ip6gre_net_ops);
 }
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 5110a418cc4d..176d74fb3b4d 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1201,13 +1201,13 @@ static int ip6_setup_cork(struct sock *sk, struct inet_cork_full *cork,
 		      rt->dst.dev->mtu : dst_mtu(&rt->dst);
 	else
 		mtu = np->pmtudisc >= IPV6_PMTUDISC_PROBE ?
-		      rt->dst.dev->mtu : dst_mtu(rt->dst.path);
+		      rt->dst.dev->mtu : dst_mtu(xfrm_dst_path(&rt->dst));
 	if (np->frag_size < mtu) {
 		if (np->frag_size)
 			mtu = np->frag_size;
 	}
 	cork->base.fragsize = mtu;
-	if (dst_allfrag(rt->dst.path))
+	if (dst_allfrag(xfrm_dst_path(&rt->dst)))
 		cork->base.flags |= IPCORK_ALLFRAG;
 	cork->base.length = 0;
 
diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index db84f523656d..6ff2f21ae3fc 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -861,7 +861,7 @@ int ip6_tnl_rcv(struct ip6_tnl *t, struct sk_buff *skb,
 		struct metadata_dst *tun_dst,
 		bool log_ecn_err)
 {
-	return __ip6_tnl_rcv(t, skb, tpi, NULL, ip6ip6_dscp_ecn_decapsulate,
+	return __ip6_tnl_rcv(t, skb, tpi, tun_dst, ip6ip6_dscp_ecn_decapsulate,
 			     log_ecn_err);
 }
 EXPORT_SYMBOL(ip6_tnl_rcv);
@@ -979,6 +979,9 @@ int ip6_tnl_xmit_ctl(struct ip6_tnl *t,
 	int ret = 0;
 	struct net *net = t->net;
 
+	if (t->parms.collect_md)
+		return 1;
+
 	if ((p->flags & IP6_TNL_F_CAP_XMIT) ||
 	    ((p->flags & IP6_TNL_F_CAP_PER_PACKET) &&
 	     (ip6_tnl_get_cap(t, laddr, raddr) & IP6_TNL_F_CAP_XMIT))) {
diff --git a/net/ipv6/ip6mr.c b/net/ipv6/ip6mr.c
index a2e1a864eb46..890f9bda06a4 100644
--- a/net/ipv6/ip6mr.c
+++ b/net/ipv6/ip6mr.c
@@ -1425,10 +1425,13 @@ int __init ip6_mr_init(void)
 		goto add_proto_fail;
 	}
 #endif
-	rtnl_register(RTNL_FAMILY_IP6MR, RTM_GETROUTE, NULL,
-		      ip6mr_rtm_dumproute, 0);
-	return 0;
+	err = rtnl_register_module(THIS_MODULE, RTNL_FAMILY_IP6MR, RTM_GETROUTE,
+				   NULL, ip6mr_rtm_dumproute, 0);
+	if (err == 0)
+		return 0;
+
 #ifdef CONFIG_IPV6_PIMSM_V2
+	inet6_del_protocol(&pim6_protocol, IPPROTO_PIM);
 add_proto_fail:
 	unregister_netdevice_notifier(&ip6_mr_notifier);
 #endif
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 7a8d1500d374..b3f4d19b3ca5 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -186,7 +186,7 @@ static void rt6_uncached_list_flush_dev(struct net *net, struct net_device *dev)
 
 static u32 *rt6_pcpu_cow_metrics(struct rt6_info *rt)
 {
-	return dst_metrics_write_ptr(rt->dst.from);
+	return dst_metrics_write_ptr(&rt->from->dst);
 }
 
 static u32 *ipv6_cow_metrics(struct dst_entry *dst, unsigned long old)
@@ -391,7 +391,7 @@ static void ip6_dst_destroy(struct dst_entry *dst)
 {
 	struct rt6_info *rt = (struct rt6_info *)dst;
 	struct rt6_exception_bucket *bucket;
-	struct dst_entry *from = dst->from;
+	struct rt6_info *from = rt->from;
 	struct inet6_dev *idev;
 
 	dst_destroy_metrics_generic(dst);
@@ -409,8 +409,8 @@ static void ip6_dst_destroy(struct dst_entry *dst)
 		kfree(bucket);
 	}
 
-	dst->from = NULL;
-	dst_release(from);
+	rt->from = NULL;
+	dst_release(&from->dst);
 }
 
 static void ip6_dst_ifdown(struct dst_entry *dst, struct net_device *dev,
@@ -443,9 +443,9 @@ static bool rt6_check_expired(const struct rt6_info *rt)
 	if (rt->rt6i_flags & RTF_EXPIRES) {
 		if (time_after(jiffies, rt->dst.expires))
 			return true;
-	} else if (rt->dst.from) {
+	} else if (rt->from) {
 		return rt->dst.obsolete != DST_OBSOLETE_FORCE_CHK ||
-		       rt6_check_expired((struct rt6_info *)rt->dst.from);
+			rt6_check_expired(rt->from);
 	}
 	return false;
 }
@@ -502,7 +502,7 @@ static inline struct rt6_info *rt6_device_match(struct net *net,
 	if (!oif && ipv6_addr_any(saddr))
 		goto out;
 
-	for (sprt = rt; sprt; sprt = rcu_dereference(sprt->dst.rt6_next)) {
+	for (sprt = rt; sprt; sprt = rcu_dereference(sprt->rt6_next)) {
 		struct net_device *dev = sprt->dst.dev;
 
 		if (oif) {
@@ -721,7 +721,7 @@ static struct rt6_info *find_rr_leaf(struct fib6_node *fn,
 
 	match = NULL;
 	cont = NULL;
-	for (rt = rr_head; rt; rt = rcu_dereference(rt->dst.rt6_next)) {
+	for (rt = rr_head; rt; rt = rcu_dereference(rt->rt6_next)) {
 		if (rt->rt6i_metric != metric) {
 			cont = rt;
 			break;
@@ -731,7 +731,7 @@ static struct rt6_info *find_rr_leaf(struct fib6_node *fn,
 	}
 
 	for (rt = leaf; rt && rt != rr_head;
-	     rt = rcu_dereference(rt->dst.rt6_next)) {
+	     rt = rcu_dereference(rt->rt6_next)) {
 		if (rt->rt6i_metric != metric) {
 			cont = rt;
 			break;
@@ -743,7 +743,7 @@ static struct rt6_info *find_rr_leaf(struct fib6_node *fn,
 	if (match || !cont)
 		return match;
 
-	for (rt = cont; rt; rt = rcu_dereference(rt->dst.rt6_next))
+	for (rt = cont; rt; rt = rcu_dereference(rt->rt6_next))
 		match = find_match(rt, oif, strict, &mpri, match, do_rr);
 
 	return match;
@@ -781,7 +781,7 @@ static struct rt6_info *rt6_select(struct net *net, struct fib6_node *fn,
 			     &do_rr);
 
 	if (do_rr) {
-		struct rt6_info *next = rcu_dereference(rt0->dst.rt6_next);
+		struct rt6_info *next = rcu_dereference(rt0->rt6_next);
 
 		/* no entries matched; do round-robin */
 		if (!next || next->rt6i_metric != rt0->rt6i_metric)
@@ -1054,7 +1054,7 @@ static struct rt6_info *ip6_rt_cache_alloc(struct rt6_info *ort,
 	 */
 
 	if (ort->rt6i_flags & (RTF_CACHE | RTF_PCPU))
-		ort = (struct rt6_info *)ort->dst.from;
+		ort = ort->from;
 
 	rcu_read_lock();
 	dev = ip6_rt_get_dev_rcu(ort);
@@ -1274,7 +1274,7 @@ static int rt6_insert_exception(struct rt6_info *nrt,
 
 	/* ort can't be a cache or pcpu route */
 	if (ort->rt6i_flags & (RTF_CACHE | RTF_PCPU))
-		ort = (struct rt6_info *)ort->dst.from;
+		ort = ort->from;
 	WARN_ON_ONCE(ort->rt6i_flags & (RTF_CACHE | RTF_PCPU));
 
 	spin_lock_bh(&rt6_exception_lock);
@@ -1415,8 +1415,8 @@ static struct rt6_info *rt6_find_cached_rt(struct rt6_info *rt,
 /* Remove the passed in cached rt from the hash table that contains it */
 int rt6_remove_exception_rt(struct rt6_info *rt)
 {
-	struct rt6_info *from = (struct rt6_info *)rt->dst.from;
 	struct rt6_exception_bucket *bucket;
+	struct rt6_info *from = rt->from;
 	struct in6_addr *src_key = NULL;
 	struct rt6_exception *rt6_ex;
 	int err;
@@ -1460,8 +1460,8 @@ int rt6_remove_exception_rt(struct rt6_info *rt)
  */
 static void rt6_update_exception_stamp_rt(struct rt6_info *rt)
 {
-	struct rt6_info *from = (struct rt6_info *)rt->dst.from;
 	struct rt6_exception_bucket *bucket;
+	struct rt6_info *from = rt->from;
 	struct in6_addr *src_key = NULL;
 	struct rt6_exception *rt6_ex;
 
@@ -1929,9 +1929,9 @@ struct dst_entry *ip6_blackhole_route(struct net *net, struct dst_entry *dst_ori
 
 static void rt6_dst_from_metrics_check(struct rt6_info *rt)
 {
-	if (rt->dst.from &&
-	    dst_metrics_ptr(&rt->dst) != dst_metrics_ptr(rt->dst.from))
-		dst_init_metrics(&rt->dst, dst_metrics_ptr(rt->dst.from), true);
+	if (rt->from &&
+	    dst_metrics_ptr(&rt->dst) != dst_metrics_ptr(&rt->from->dst))
+		dst_init_metrics(&rt->dst, dst_metrics_ptr(&rt->from->dst), true);
 }
 
 static struct dst_entry *rt6_check(struct rt6_info *rt, u32 cookie)
@@ -1951,7 +1951,7 @@ static struct dst_entry *rt6_dst_from_check(struct rt6_info *rt, u32 cookie)
 {
 	if (!__rt6_check_expired(rt) &&
 	    rt->dst.obsolete == DST_OBSOLETE_FORCE_CHK &&
-	    rt6_check((struct rt6_info *)(rt->dst.from), cookie))
+	    rt6_check(rt->from, cookie))
 		return &rt->dst;
 	else
 		return NULL;
@@ -1971,7 +1971,7 @@ static struct dst_entry *ip6_dst_check(struct dst_entry *dst, u32 cookie)
 	rt6_dst_from_metrics_check(rt);
 
 	if (rt->rt6i_flags & RTF_PCPU ||
-	    (unlikely(!list_empty(&rt->rt6i_uncached)) && rt->dst.from))
+	    (unlikely(!list_empty(&rt->rt6i_uncached)) && rt->from))
 		return rt6_dst_from_check(rt, cookie);
 	else
 		return rt6_check(rt, cookie);
@@ -3055,11 +3055,11 @@ out:
 
 static void rt6_set_from(struct rt6_info *rt, struct rt6_info *from)
 {
-	BUG_ON(from->dst.from);
+	BUG_ON(from->from);
 
 	rt->rt6i_flags &= ~RTF_EXPIRES;
 	dst_hold(&from->dst);
-	rt->dst.from = &from->dst;
+	rt->from = from;
 	dst_init_metrics(&rt->dst, dst_metrics_ptr(&from->dst), true);
 }
 
@@ -4596,8 +4596,6 @@ static int __net_init ip6_route_net_init(struct net *net)
 					   GFP_KERNEL);
 	if (!net->ipv6.ip6_null_entry)
 		goto out_ip6_dst_entries;
-	net->ipv6.ip6_null_entry->dst.path =
-		(struct dst_entry *)net->ipv6.ip6_null_entry;
 	net->ipv6.ip6_null_entry->dst.ops = &net->ipv6.ip6_dst_ops;
 	dst_init_metrics(&net->ipv6.ip6_null_entry->dst,
 			 ip6_template_metrics, true);
@@ -4609,8 +4607,6 @@ static int __net_init ip6_route_net_init(struct net *net)
 					       GFP_KERNEL);
 	if (!net->ipv6.ip6_prohibit_entry)
 		goto out_ip6_null_entry;
-	net->ipv6.ip6_prohibit_entry->dst.path =
-		(struct dst_entry *)net->ipv6.ip6_prohibit_entry;
 	net->ipv6.ip6_prohibit_entry->dst.ops = &net->ipv6.ip6_dst_ops;
 	dst_init_metrics(&net->ipv6.ip6_prohibit_entry->dst,
 			 ip6_template_metrics, true);
@@ -4620,8 +4616,6 @@ static int __net_init ip6_route_net_init(struct net *net)
 					       GFP_KERNEL);
 	if (!net->ipv6.ip6_blk_hole_entry)
 		goto out_ip6_prohibit_entry;
-	net->ipv6.ip6_blk_hole_entry->dst.path =
-		(struct dst_entry *)net->ipv6.ip6_blk_hole_entry;
 	net->ipv6.ip6_blk_hole_entry->dst.ops = &net->ipv6.ip6_dst_ops;
 	dst_init_metrics(&net->ipv6.ip6_blk_hole_entry->dst,
 			 ip6_template_metrics, true);
@@ -4778,11 +4772,20 @@ int __init ip6_route_init(void)
 	if (ret)
 		goto fib6_rules_init;
 
-	ret = -ENOBUFS;
-	if (__rtnl_register(PF_INET6, RTM_NEWROUTE, inet6_rtm_newroute, NULL, 0) ||
-	    __rtnl_register(PF_INET6, RTM_DELROUTE, inet6_rtm_delroute, NULL, 0) ||
-	    __rtnl_register(PF_INET6, RTM_GETROUTE, inet6_rtm_getroute, NULL,
-			    RTNL_FLAG_DOIT_UNLOCKED))
+	ret = rtnl_register_module(THIS_MODULE, PF_INET6, RTM_NEWROUTE,
+				   inet6_rtm_newroute, NULL, 0);
+	if (ret < 0)
+		goto out_register_late_subsys;
+
+	ret = rtnl_register_module(THIS_MODULE, PF_INET6, RTM_DELROUTE,
+				   inet6_rtm_delroute, NULL, 0);
+	if (ret < 0)
+		goto out_register_late_subsys;
+
+	ret = rtnl_register_module(THIS_MODULE, PF_INET6, RTM_GETROUTE,
+				   inet6_rtm_getroute, NULL,
+				   RTNL_FLAG_DOIT_UNLOCKED);
+	if (ret < 0)
 		goto out_register_late_subsys;
 
 	ret = register_netdevice_notifier(&ip6_route_dev_notifier);
@@ -4800,6 +4803,7 @@ out:
 	return ret;
 
 out_register_late_subsys:
+	rtnl_unregister_all(PF_INET6);
 	unregister_pernet_subsys(&ip6_route_net_late_ops);
 fib6_rules_init:
 	fib6_rules_cleanup();
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 3f30fa313bf2..eecf9f0faf29 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -89,28 +89,12 @@ static u32 udp6_ehashfn(const struct net *net,
 			       udp_ipv6_hash_secret + net_hash_mix(net));
 }
 
-static u32 udp6_portaddr_hash(const struct net *net,
-			      const struct in6_addr *addr6,
-			      unsigned int port)
-{
-	unsigned int hash, mix = net_hash_mix(net);
-
-	if (ipv6_addr_any(addr6))
-		hash = jhash_1word(0, mix);
-	else if (ipv6_addr_v4mapped(addr6))
-		hash = jhash_1word((__force u32)addr6->s6_addr32[3], mix);
-	else
-		hash = jhash2((__force u32 *)addr6->s6_addr32, 4, mix);
-
-	return hash ^ port;
-}
-
 int udp_v6_get_port(struct sock *sk, unsigned short snum)
 {
 	unsigned int hash2_nulladdr =
-		udp6_portaddr_hash(sock_net(sk), &in6addr_any, snum);
+		ipv6_portaddr_hash(sock_net(sk), &in6addr_any, snum);
 	unsigned int hash2_partial =
-		udp6_portaddr_hash(sock_net(sk), &sk->sk_v6_rcv_saddr, 0);
+		ipv6_portaddr_hash(sock_net(sk), &sk->sk_v6_rcv_saddr, 0);
 
 	/* precompute partial secondary hash */
 	udp_sk(sk)->udp_portaddr_hash = hash2_partial;
@@ -119,7 +103,7 @@ int udp_v6_get_port(struct sock *sk, unsigned short snum)
 
 static void udp_v6_rehash(struct sock *sk)
 {
-	u16 new_hash = udp6_portaddr_hash(sock_net(sk),
+	u16 new_hash = ipv6_portaddr_hash(sock_net(sk),
 					  &sk->sk_v6_rcv_saddr,
 					  inet_sk(sk)->inet_num);
 
@@ -184,7 +168,7 @@ static struct sock *udp6_lib_lookup2(struct net *net,
 		struct udp_hslot *hslot2, struct sk_buff *skb)
 {
 	struct sock *sk, *result;
-	int score, badness, matches = 0, reuseport = 0;
+	int score, badness;
 	u32 hash = 0;
 
 	result = NULL;
@@ -193,8 +177,7 @@ static struct sock *udp6_lib_lookup2(struct net *net,
 		score = compute_score(sk, net, saddr, sport,
 				      daddr, hnum, dif, sdif, exact_dif);
 		if (score > badness) {
-			reuseport = sk->sk_reuseport;
-			if (reuseport) {
+			if (sk->sk_reuseport) {
 				hash = udp6_ehashfn(net, daddr, hnum,
 						    saddr, sport);
 
@@ -202,15 +185,9 @@ static struct sock *udp6_lib_lookup2(struct net *net,
 							sizeof(struct udphdr));
 				if (result)
 					return result;
-				matches = 1;
 			}
 			result = sk;
 			badness = score;
-		} else if (score == badness && reuseport) {
-			matches++;
-			if (reciprocal_scale(hash, matches) == 0)
-				result = sk;
-			hash = next_pseudo_random32(hash);
 		}
 	}
 	return result;
@@ -228,11 +205,11 @@ struct sock *__udp6_lib_lookup(struct net *net,
 	unsigned int hash2, slot2, slot = udp_hashfn(net, hnum, udptable->mask);
 	struct udp_hslot *hslot2, *hslot = &udptable->hash[slot];
 	bool exact_dif = udp6_lib_exact_dif_match(net, skb);
-	int score, badness, matches = 0, reuseport = 0;
+	int score, badness;
 	u32 hash = 0;
 
 	if (hslot->count > 10) {
-		hash2 = udp6_portaddr_hash(net, daddr, hnum);
+		hash2 = ipv6_portaddr_hash(net, daddr, hnum);
 		slot2 = hash2 & udptable->mask;
 		hslot2 = &udptable->hash2[slot2];
 		if (hslot->count < hslot2->count)
@@ -243,7 +220,7 @@ struct sock *__udp6_lib_lookup(struct net *net,
 					  hslot2, skb);
 		if (!result) {
 			unsigned int old_slot2 = slot2;
-			hash2 = udp6_portaddr_hash(net, &in6addr_any, hnum);
+			hash2 = ipv6_portaddr_hash(net, &in6addr_any, hnum);
 			slot2 = hash2 & udptable->mask;
 			/* avoid searching the same slot again. */
 			if (unlikely(slot2 == old_slot2))
@@ -267,23 +244,16 @@ begin:
 		score = compute_score(sk, net, saddr, sport, daddr, hnum, dif,
 				      sdif, exact_dif);
 		if (score > badness) {
-			reuseport = sk->sk_reuseport;
-			if (reuseport) {
+			if (sk->sk_reuseport) {
 				hash = udp6_ehashfn(net, daddr, hnum,
 						    saddr, sport);
 				result = reuseport_select_sock(sk, hash, skb,
 							sizeof(struct udphdr));
 				if (result)
 					return result;
-				matches = 1;
 			}
 			result = sk;
 			badness = score;
-		} else if (score == badness && reuseport) {
-			matches++;
-			if (reciprocal_scale(hash, matches) == 0)
-				result = sk;
-			hash = next_pseudo_random32(hash);
 		}
 	}
 	return result;
@@ -719,9 +689,9 @@ static int __udp6_lib_mcast_deliver(struct net *net, struct sk_buff *skb,
 	struct sk_buff *nskb;
 
 	if (use_hash2) {
-		hash2_any = udp6_portaddr_hash(net, &in6addr_any, hnum) &
+		hash2_any = ipv6_portaddr_hash(net, &in6addr_any, hnum) &
 			    udptable->mask;
-		hash2 = udp6_portaddr_hash(net, daddr, hnum) & udptable->mask;
+		hash2 = ipv6_portaddr_hash(net, daddr, hnum) & udptable->mask;
 start_lookup:
 		hslot = &udptable->hash2[hash2];
 		offset = offsetof(typeof(*sk), __sk_common.skc_portaddr_node);
@@ -909,7 +879,7 @@ static struct sock *__udp6_lib_demux_lookup(struct net *net,
 			int dif, int sdif)
 {
 	unsigned short hnum = ntohs(loc_port);
-	unsigned int hash2 = udp6_portaddr_hash(net, loc_addr, hnum);
+	unsigned int hash2 = ipv6_portaddr_hash(net, loc_addr, hnum);
 	unsigned int slot2 = hash2 & udp_table.mask;
 	struct udp_hslot *hslot2 = &udp_table.hash2[slot2];
 	const __portpair ports = INET_COMBINED_PORTS(rmt_port, hnum);
diff --git a/net/ipv6/xfrm6_mode_tunnel.c b/net/ipv6/xfrm6_mode_tunnel.c
index 02556e356f87..e66b94f46532 100644
--- a/net/ipv6/xfrm6_mode_tunnel.c
+++ b/net/ipv6/xfrm6_mode_tunnel.c
@@ -59,7 +59,7 @@ static int xfrm6_mode_tunnel_output(struct xfrm_state *x, struct sk_buff *skb)
 	if (x->props.flags & XFRM_STATE_NOECN)
 		dsfield &= ~INET_ECN_MASK;
 	ipv6_change_dsfield(top_iph, 0, dsfield);
-	top_iph->hop_limit = ip6_dst_hoplimit(dst->child);
+	top_iph->hop_limit = ip6_dst_hoplimit(xfrm_dst_child(dst));
 	top_iph->saddr = *(struct in6_addr *)&x->props.saddr;
 	top_iph->daddr = *(struct in6_addr *)&x->id.daddr;
 	return 0;
diff --git a/net/ipv6/xfrm6_policy.c b/net/ipv6/xfrm6_policy.c
index 885ade234a49..09fb44ee3b45 100644
--- a/net/ipv6/xfrm6_policy.c
+++ b/net/ipv6/xfrm6_policy.c
@@ -265,7 +265,7 @@ static void xfrm6_dst_ifdown(struct dst_entry *dst, struct net_device *dev,
 			in6_dev_put(xdst->u.rt6.rt6i_idev);
 			xdst->u.rt6.rt6i_idev = loopback_idev;
 			in6_dev_hold(loopback_idev);
-			xdst = (struct xfrm_dst *)xdst->u.dst.child;
+			xdst = (struct xfrm_dst *)xfrm_dst_child(&xdst->u.dst);
 		} while (xdst->u.dst.xfrm);
 
 		__in6_dev_put(loopback_idev);
diff --git a/net/mpls/af_mpls.c b/net/mpls/af_mpls.c
index 8ca9915befc8..5dce8336d33f 100644
--- a/net/mpls/af_mpls.c
+++ b/net/mpls/af_mpls.c
@@ -2510,12 +2510,15 @@ static int __init mpls_init(void)
 
 	rtnl_af_register(&mpls_af_ops);
 
-	rtnl_register(PF_MPLS, RTM_NEWROUTE, mpls_rtm_newroute, NULL, 0);
-	rtnl_register(PF_MPLS, RTM_DELROUTE, mpls_rtm_delroute, NULL, 0);
-	rtnl_register(PF_MPLS, RTM_GETROUTE, mpls_getroute, mpls_dump_routes,
-		      0);
-	rtnl_register(PF_MPLS, RTM_GETNETCONF, mpls_netconf_get_devconf,
-		      mpls_netconf_dump_devconf, 0);
+	rtnl_register_module(THIS_MODULE, PF_MPLS, RTM_NEWROUTE,
+			     mpls_rtm_newroute, NULL, 0);
+	rtnl_register_module(THIS_MODULE, PF_MPLS, RTM_DELROUTE,
+			     mpls_rtm_delroute, NULL, 0);
+	rtnl_register_module(THIS_MODULE, PF_MPLS, RTM_GETROUTE,
+			     mpls_getroute, mpls_dump_routes, 0);
+	rtnl_register_module(THIS_MODULE, PF_MPLS, RTM_GETNETCONF,
+			     mpls_netconf_get_devconf,
+			     mpls_netconf_dump_devconf, 0);
 	err = ipgre_tunnel_encap_add_mpls_ops();
 	if (err)
 		pr_err("Can't add mpls over gre tunnel ops\n");
diff --git a/net/netfilter/xt_policy.c b/net/netfilter/xt_policy.c
index 2b4ab189bba7..5639fb03bdd9 100644
--- a/net/netfilter/xt_policy.c
+++ b/net/netfilter/xt_policy.c
@@ -93,7 +93,8 @@ match_policy_out(const struct sk_buff *skb, const struct xt_policy_info *info,
 	if (dst->xfrm == NULL)
 		return -1;
 
-	for (i = 0; dst && dst->xfrm; dst = dst->child, i++) {
+	for (i = 0; dst && dst->xfrm;
+	     dst = ((struct xfrm_dst *)dst)->child, i++) {
 		pos = strict ? i : 0;
 		if (pos >= info->len)
 			return 0;
diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c
index dbe2379329c5..76d050aba7a4 100644
--- a/net/openvswitch/flow.c
+++ b/net/openvswitch/flow.c
@@ -56,12 +56,12 @@
 
 u64 ovs_flow_used_time(unsigned long flow_jiffies)
 {
-	struct timespec cur_ts;
+	struct timespec64 cur_ts;
 	u64 cur_ms, idle_ms;
 
-	ktime_get_ts(&cur_ts);
+	ktime_get_ts64(&cur_ts);
 	idle_ms = jiffies_to_msecs(jiffies - flow_jiffies);
-	cur_ms = (u64)cur_ts.tv_sec * MSEC_PER_SEC +
+	cur_ms = (u64)(u32)cur_ts.tv_sec * MSEC_PER_SEC +
 		 cur_ts.tv_nsec / NSEC_PER_MSEC;
 
 	return cur_ms - idle_ms;
diff --git a/net/openvswitch/vport-internal_dev.c b/net/openvswitch/vport-internal_dev.c
index 04a3128adcf0..3e7747549f90 100644
--- a/net/openvswitch/vport-internal_dev.c
+++ b/net/openvswitch/vport-internal_dev.c
@@ -126,18 +126,12 @@ internal_get_stats(struct net_device *dev, struct rtnl_link_stats64 *stats)
 	}
 }
 
-static void internal_set_rx_headroom(struct net_device *dev, int new_hr)
-{
-	dev->needed_headroom = new_hr < 0 ? 0 : new_hr;
-}
-
 static const struct net_device_ops internal_dev_netdev_ops = {
 	.ndo_open = internal_dev_open,
 	.ndo_stop = internal_dev_stop,
 	.ndo_start_xmit = internal_dev_xmit,
 	.ndo_set_mac_address = eth_mac_addr,
 	.ndo_get_stats64 = internal_get_stats,
-	.ndo_set_rx_headroom = internal_set_rx_headroom,
 };
 
 static struct rtnl_link_ops internal_dev_link_ops __read_mostly = {
@@ -154,7 +148,7 @@ static void do_setup(struct net_device *netdev)
 
 	netdev->priv_flags &= ~IFF_TX_SKB_SHARING;
 	netdev->priv_flags |= IFF_LIVE_ADDR_CHANGE | IFF_OPENVSWITCH |
-			      IFF_PHONY_HEADROOM | IFF_NO_QUEUE;
+			      IFF_NO_QUEUE;
 	netdev->needs_free_netdev = true;
 	netdev->priv_destructor = internal_dev_destructor;
 	netdev->ethtool_ops = &internal_dev_ethtool_ops;
@@ -195,7 +189,6 @@ static struct vport *internal_dev_create(const struct vport_parms *parms)
 		err = -ENOMEM;
 		goto error_free_netdev;
 	}
-	vport->dev->needed_headroom = vport->dp->max_headroom;
 
 	dev_net_set(vport->dev, ovs_dp_get_net(vport->dp));
 	internal_dev = internal_dev_priv(vport->dev);
diff --git a/net/phonet/pn_netlink.c b/net/phonet/pn_netlink.c
index da754fc926e7..871eaf2cb85e 100644
--- a/net/phonet/pn_netlink.c
+++ b/net/phonet/pn_netlink.c
@@ -299,16 +299,21 @@ out:
 
 int __init phonet_netlink_register(void)
 {
-	int err = __rtnl_register(PF_PHONET, RTM_NEWADDR, addr_doit,
-				  NULL, 0);
+	int err = rtnl_register_module(THIS_MODULE, PF_PHONET, RTM_NEWADDR,
+				       addr_doit, NULL, 0);
 	if (err)
 		return err;
 
-	/* Further __rtnl_register() cannot fail */
-	__rtnl_register(PF_PHONET, RTM_DELADDR, addr_doit, NULL, 0);
-	__rtnl_register(PF_PHONET, RTM_GETADDR, NULL, getaddr_dumpit, 0);
-	__rtnl_register(PF_PHONET, RTM_NEWROUTE, route_doit, NULL, 0);
-	__rtnl_register(PF_PHONET, RTM_DELROUTE, route_doit, NULL, 0);
-	__rtnl_register(PF_PHONET, RTM_GETROUTE, NULL, route_dumpit, 0);
+	/* Further rtnl_register_module() cannot fail */
+	rtnl_register_module(THIS_MODULE, PF_PHONET, RTM_DELADDR,
+			     addr_doit, NULL, 0);
+	rtnl_register_module(THIS_MODULE, PF_PHONET, RTM_GETADDR,
+			     NULL, getaddr_dumpit, 0);
+	rtnl_register_module(THIS_MODULE, PF_PHONET, RTM_NEWROUTE,
+			     route_doit, NULL, 0);
+	rtnl_register_module(THIS_MODULE, PF_PHONET, RTM_DELROUTE,
+			     route_doit, NULL, 0);
+	rtnl_register_module(THIS_MODULE, PF_PHONET, RTM_GETROUTE,
+			     NULL, route_dumpit, 0);
 	return 0;
 }
diff --git a/net/qrtr/qrtr.c b/net/qrtr/qrtr.c
index 77ab05e23001..5fb3929e3d7d 100644
--- a/net/qrtr/qrtr.c
+++ b/net/qrtr/qrtr.c
@@ -1116,9 +1116,13 @@ static int __init qrtr_proto_init(void)
 		return rc;
 	}
 
-	rtnl_register(PF_QIPCRTR, RTM_NEWADDR, qrtr_addr_doit, NULL, 0);
+	rc = rtnl_register_module(THIS_MODULE, PF_QIPCRTR, RTM_NEWADDR, qrtr_addr_doit, NULL, 0);
+	if (rc) {
+		sock_unregister(qrtr_family.family);
+		proto_unregister(&qrtr_proto);
+	}
 
-	return 0;
+	return rc;
 }
 postcore_initcall(qrtr_proto_init);
 
diff --git a/net/rds/connection.c b/net/rds/connection.c
index 7ee2d5d68b78..6492c0b608a4 100644
--- a/net/rds/connection.c
+++ b/net/rds/connection.c
@@ -230,8 +230,8 @@ static struct rds_connection *__rds_conn_create(struct net *net,
 
 	rdsdebug("allocated conn %p for %pI4 -> %pI4 over %s %s\n",
 	  conn, &laddr, &faddr,
-	  trans->t_name ? trans->t_name : "[unknown]",
-	  is_outgoing ? "(outgoing)" : "");
+	  strnlen(trans->t_name, sizeof(trans->t_name)) ? trans->t_name :
+	  "[unknown]", is_outgoing ? "(outgoing)" : "");
 
 	/*
 	 * Since we ran without holding the conn lock, someone could
@@ -366,6 +366,8 @@ void rds_conn_shutdown(struct rds_conn_path *cp)
 	 * to the conn hash, so we never trigger a reconnect on this
 	 * conn - the reconnect is always triggered by the active peer. */
 	cancel_delayed_work_sync(&cp->cp_conn_w);
+	if (conn->c_destroy_in_prog)
+		return;
 	rcu_read_lock();
 	if (!hlist_unhashed(&conn->c_hash_node)) {
 		rcu_read_unlock();
@@ -445,7 +447,6 @@ void rds_conn_destroy(struct rds_connection *conn)
 	 */
 	rds_cong_remove_conn(conn);
 
-	put_net(conn->c_net);
 	kfree(conn->c_path);
 	kmem_cache_free(rds_conn_slab, conn);
 
diff --git a/net/rds/rds.h b/net/rds/rds.h
index c349c71babff..d09f6c1facb4 100644
--- a/net/rds/rds.h
+++ b/net/rds/rds.h
@@ -150,7 +150,7 @@ struct rds_connection {
 
 	/* Protocol version */
 	unsigned int		c_version;
-	struct net		*c_net;
+	possible_net_t		c_net;
 
 	struct list_head	c_map_item;
 	unsigned long		c_map_queued;
@@ -165,13 +165,13 @@ struct rds_connection {
 static inline
 struct net *rds_conn_net(struct rds_connection *conn)
 {
-	return conn->c_net;
+	return read_pnet(&conn->c_net);
 }
 
 static inline
 void rds_conn_net_set(struct rds_connection *conn, struct net *net)
 {
-	conn->c_net = get_net(net);
+	write_pnet(&conn->c_net, net);
 }
 
 #define RDS_FLAG_CONG_BITMAP	0x01
diff --git a/net/rds/tcp.c b/net/rds/tcp.c
index 6b7ee71f40c6..39f502d47969 100644
--- a/net/rds/tcp.c
+++ b/net/rds/tcp.c
@@ -306,7 +306,8 @@ static void rds_tcp_conn_free(void *arg)
 	rdsdebug("freeing tc %p\n", tc);
 
 	spin_lock_irqsave(&rds_tcp_conn_lock, flags);
-	list_del(&tc->t_tcp_node);
+	if (!tc->t_tcp_node_detached)
+		list_del(&tc->t_tcp_node);
 	spin_unlock_irqrestore(&rds_tcp_conn_lock, flags);
 
 	kmem_cache_free(rds_tcp_conn_slab, tc);
@@ -495,27 +496,6 @@ static struct pernet_operations rds_tcp_net_ops = {
 	.size = sizeof(struct rds_tcp_net),
 };
 
-/* explicitly send a RST on each socket, thereby releasing any socket refcnts
- * that may otherwise hold up netns deletion.
- */
-static void rds_tcp_conn_paths_destroy(struct rds_connection *conn)
-{
-	struct rds_conn_path *cp;
-	struct rds_tcp_connection *tc;
-	int i;
-	struct sock *sk;
-
-	for (i = 0; i < RDS_MPATH_WORKERS; i++) {
-		cp = &conn->c_path[i];
-		tc = cp->cp_transport_data;
-		if (!tc->t_sock)
-			continue;
-		sk = tc->t_sock->sk;
-		sk->sk_prot->disconnect(sk, 0);
-		tcp_done(sk);
-	}
-}
-
 static void rds_tcp_kill_sock(struct net *net)
 {
 	struct rds_tcp_connection *tc, *_tc;
@@ -527,18 +507,20 @@ static void rds_tcp_kill_sock(struct net *net)
 	rds_tcp_listen_stop(lsock, &rtn->rds_tcp_accept_w);
 	spin_lock_irq(&rds_tcp_conn_lock);
 	list_for_each_entry_safe(tc, _tc, &rds_tcp_conn_list, t_tcp_node) {
-		struct net *c_net = tc->t_cpath->cp_conn->c_net;
+		struct net *c_net = read_pnet(&tc->t_cpath->cp_conn->c_net);
 
 		if (net != c_net || !tc->t_sock)
 			continue;
-		if (!list_has_conn(&tmp_list, tc->t_cpath->cp_conn))
+		if (!list_has_conn(&tmp_list, tc->t_cpath->cp_conn)) {
 			list_move_tail(&tc->t_tcp_node, &tmp_list);
+		} else {
+			list_del(&tc->t_tcp_node);
+			tc->t_tcp_node_detached = true;
+		}
 	}
 	spin_unlock_irq(&rds_tcp_conn_lock);
-	list_for_each_entry_safe(tc, _tc, &tmp_list, t_tcp_node) {
-		rds_tcp_conn_paths_destroy(tc->t_cpath->cp_conn);
+	list_for_each_entry_safe(tc, _tc, &tmp_list, t_tcp_node)
 		rds_conn_destroy(tc->t_cpath->cp_conn);
-	}
 }
 
 void *rds_tcp_listen_sock_def_readable(struct net *net)
@@ -586,7 +568,7 @@ static void rds_tcp_sysctl_reset(struct net *net)
 
 	spin_lock_irq(&rds_tcp_conn_lock);
 	list_for_each_entry_safe(tc, _tc, &rds_tcp_conn_list, t_tcp_node) {
-		struct net *c_net = tc->t_cpath->cp_conn->c_net;
+		struct net *c_net = read_pnet(&tc->t_cpath->cp_conn->c_net);
 
 		if (net != c_net || !tc->t_sock)
 			continue;
diff --git a/net/rds/tcp.h b/net/rds/tcp.h
index 1aafbf7c3011..e7858ee8ed8b 100644
--- a/net/rds/tcp.h
+++ b/net/rds/tcp.h
@@ -12,6 +12,7 @@ struct rds_tcp_incoming {
 struct rds_tcp_connection {
 
 	struct list_head	t_tcp_node;
+	bool			t_tcp_node_detached;
 	struct rds_conn_path	*t_cpath;
 	/* t_conn_path_lock synchronizes the connection establishment between
 	 * rds_tcp_accept_one and rds_tcp_conn_path_connect
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index 4d33a50a8a6d..52622a3d2517 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -99,7 +99,7 @@ int __tcf_idr_release(struct tc_action *p, bool bind, bool strict)
 		p->tcfa_refcnt--;
 		if (p->tcfa_bindcnt <= 0 && p->tcfa_refcnt <= 0) {
 			if (p->ops->cleanup)
-				p->ops->cleanup(p, bind);
+				p->ops->cleanup(p);
 			tcf_idr_remove(p->idrinfo, p);
 			ret = ACT_P_DELETED;
 		}
diff --git a/net/sched/act_bpf.c b/net/sched/act_bpf.c
index 5ef8ce8c83d4..e6c477fa9ca5 100644
--- a/net/sched/act_bpf.c
+++ b/net/sched/act_bpf.c
@@ -357,7 +357,7 @@ out:
 	return ret;
 }
 
-static void tcf_bpf_cleanup(struct tc_action *act, int bind)
+static void tcf_bpf_cleanup(struct tc_action *act)
 {
 	struct tcf_bpf_cfg tmp;
 
diff --git a/net/sched/act_ife.c b/net/sched/act_ife.c
index 3007cb1310ea..dee9cf22686c 100644
--- a/net/sched/act_ife.c
+++ b/net/sched/act_ife.c
@@ -387,7 +387,7 @@ out_nlmsg_trim:
 }
 
 /* under ife->tcf_lock */
-static void _tcf_ife_cleanup(struct tc_action *a, int bind)
+static void _tcf_ife_cleanup(struct tc_action *a)
 {
 	struct tcf_ife_info *ife = to_ife(a);
 	struct tcf_meta_info *e, *n;
@@ -405,13 +405,13 @@ static void _tcf_ife_cleanup(struct tc_action *a, int bind)
 	}
 }
 
-static void tcf_ife_cleanup(struct tc_action *a, int bind)
+static void tcf_ife_cleanup(struct tc_action *a)
 {
 	struct tcf_ife_info *ife = to_ife(a);
 	struct tcf_ife_params *p;
 
 	spin_lock_bh(&ife->tcf_lock);
-	_tcf_ife_cleanup(a, bind);
+	_tcf_ife_cleanup(a);
 	spin_unlock_bh(&ife->tcf_lock);
 
 	p = rcu_dereference_protected(ife->params, 1);
@@ -546,7 +546,7 @@ metadata_parse_err:
 			if (exists)
 				tcf_idr_release(*a, bind);
 			if (ret == ACT_P_CREATED)
-				_tcf_ife_cleanup(*a, bind);
+				_tcf_ife_cleanup(*a);
 
 			if (exists)
 				spin_unlock_bh(&ife->tcf_lock);
@@ -567,7 +567,7 @@ metadata_parse_err:
 		err = use_all_metadata(ife);
 		if (err) {
 			if (ret == ACT_P_CREATED)
-				_tcf_ife_cleanup(*a, bind);
+				_tcf_ife_cleanup(*a);
 
 			if (exists)
 				spin_unlock_bh(&ife->tcf_lock);
diff --git a/net/sched/act_ipt.c b/net/sched/act_ipt.c
index d9e399a7e3d5..2479b255dc1d 100644
--- a/net/sched/act_ipt.c
+++ b/net/sched/act_ipt.c
@@ -77,7 +77,7 @@ static void ipt_destroy_target(struct xt_entry_target *t)
 	module_put(par.target->me);
 }
 
-static void tcf_ipt_release(struct tc_action *a, int bind)
+static void tcf_ipt_release(struct tc_action *a)
 {
 	struct tcf_ipt *ipt = to_ipt(a);
 	ipt_destroy_target(ipt->tcfi_t);
diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
index 8b3e59388480..cee2d413bf57 100644
--- a/net/sched/act_mirred.c
+++ b/net/sched/act_mirred.c
@@ -29,7 +29,6 @@
 #include <net/tc_act/tc_mirred.h>
 
 static LIST_HEAD(mirred_list);
-static DEFINE_SPINLOCK(mirred_list_lock);
 
 static bool tcf_mirred_is_act_redirect(int action)
 {
@@ -50,18 +49,15 @@ static bool tcf_mirred_act_wants_ingress(int action)
 	}
 }
 
-static void tcf_mirred_release(struct tc_action *a, int bind)
+static void tcf_mirred_release(struct tc_action *a)
 {
 	struct tcf_mirred *m = to_mirred(a);
 	struct net_device *dev;
 
-	/* We could be called either in a RCU callback or with RTNL lock held. */
-	spin_lock_bh(&mirred_list_lock);
 	list_del(&m->tcfm_list);
-	dev = rcu_dereference_protected(m->tcfm_dev, 1);
+	dev = rtnl_dereference(m->tcfm_dev);
 	if (dev)
 		dev_put(dev);
-	spin_unlock_bh(&mirred_list_lock);
 }
 
 static const struct nla_policy mirred_policy[TCA_MIRRED_MAX + 1] = {
@@ -139,8 +135,6 @@ static int tcf_mirred_init(struct net *net, struct nlattr *nla,
 	m->tcf_action = parm->action;
 	m->tcfm_eaction = parm->eaction;
 	if (dev != NULL) {
-		m->tcfm_ifindex = parm->ifindex;
-		m->net = net;
 		if (ret != ACT_P_CREATED)
 			dev_put(rcu_dereference_protected(m->tcfm_dev, 1));
 		dev_hold(dev);
@@ -149,9 +143,7 @@ static int tcf_mirred_init(struct net *net, struct nlattr *nla,
 	}
 
 	if (ret == ACT_P_CREATED) {
-		spin_lock_bh(&mirred_list_lock);
 		list_add(&m->tcfm_list, &mirred_list);
-		spin_unlock_bh(&mirred_list_lock);
 		tcf_idr_insert(tn, *a);
 	}
 
@@ -247,13 +239,14 @@ static int tcf_mirred_dump(struct sk_buff *skb, struct tc_action *a, int bind,
 {
 	unsigned char *b = skb_tail_pointer(skb);
 	struct tcf_mirred *m = to_mirred(a);
+	struct net_device *dev = rtnl_dereference(m->tcfm_dev);
 	struct tc_mirred opt = {
 		.index   = m->tcf_index,
 		.action  = m->tcf_action,
 		.refcnt  = m->tcf_refcnt - ref,
 		.bindcnt = m->tcf_bindcnt - bind,
 		.eaction = m->tcfm_eaction,
-		.ifindex = m->tcfm_ifindex,
+		.ifindex = dev ? dev->ifindex : 0,
 	};
 	struct tcf_t t;
 
@@ -294,7 +287,6 @@ static int mirred_device_event(struct notifier_block *unused,
 
 	ASSERT_RTNL();
 	if (event == NETDEV_UNREGISTER) {
-		spin_lock_bh(&mirred_list_lock);
 		list_for_each_entry(m, &mirred_list, tcfm_list) {
 			if (rcu_access_pointer(m->tcfm_dev) == dev) {
 				dev_put(dev);
@@ -304,7 +296,6 @@ static int mirred_device_event(struct notifier_block *unused,
 				RCU_INIT_POINTER(m->tcfm_dev, NULL);
 			}
 		}
-		spin_unlock_bh(&mirred_list_lock);
 	}
 
 	return NOTIFY_DONE;
@@ -318,7 +309,7 @@ static struct net_device *tcf_mirred_get_dev(const struct tc_action *a)
 {
 	struct tcf_mirred *m = to_mirred(a);
 
-	return __dev_get_by_index(m->net, m->tcfm_ifindex);
+	return rtnl_dereference(m->tcfm_dev);
 }
 
 static struct tc_action_ops act_mirred_ops = {
diff --git a/net/sched/act_pedit.c b/net/sched/act_pedit.c
index 491fe5deb09e..dba996bcd6dc 100644
--- a/net/sched/act_pedit.c
+++ b/net/sched/act_pedit.c
@@ -216,7 +216,7 @@ static int tcf_pedit_init(struct net *net, struct nlattr *nla,
 	return ret;
 }
 
-static void tcf_pedit_cleanup(struct tc_action *a, int bind)
+static void tcf_pedit_cleanup(struct tc_action *a)
 {
 	struct tcf_pedit *p = to_pedit(a);
 	struct tc_pedit_key *keys = p->tcfp_keys;
diff --git a/net/sched/act_sample.c b/net/sched/act_sample.c
index 9438969290a6..859a93903339 100644
--- a/net/sched/act_sample.c
+++ b/net/sched/act_sample.c
@@ -96,7 +96,7 @@ static int tcf_sample_init(struct net *net, struct nlattr *nla,
 	return ret;
 }
 
-static void tcf_sample_cleanup(struct tc_action *a, int bind)
+static void tcf_sample_cleanup(struct tc_action *a)
 {
 	struct tcf_sample *s = to_sample(a);
 	struct psample_group *psample_group;
diff --git a/net/sched/act_simple.c b/net/sched/act_simple.c
index e7b57e5071a3..eda57b47a6b6 100644
--- a/net/sched/act_simple.c
+++ b/net/sched/act_simple.c
@@ -47,7 +47,7 @@ static int tcf_simp(struct sk_buff *skb, const struct tc_action *a,
 	return d->tcf_action;
 }
 
-static void tcf_simp_release(struct tc_action *a, int bind)
+static void tcf_simp_release(struct tc_action *a)
 {
 	struct tcf_defact *d = to_defact(a);
 	kfree(d->tcfd_defdata);
diff --git a/net/sched/act_skbmod.c b/net/sched/act_skbmod.c
index b642ad3d39dd..f090bba1a79e 100644
--- a/net/sched/act_skbmod.c
+++ b/net/sched/act_skbmod.c
@@ -184,7 +184,7 @@ static int tcf_skbmod_init(struct net *net, struct nlattr *nla,
 	return ret;
 }
 
-static void tcf_skbmod_cleanup(struct tc_action *a, int bind)
+static void tcf_skbmod_cleanup(struct tc_action *a)
 {
 	struct tcf_skbmod *d = to_skbmod(a);
 	struct tcf_skbmod_params  *p;
diff --git a/net/sched/act_tunnel_key.c b/net/sched/act_tunnel_key.c
index 30c96274c638..57b63bdec3ae 100644
--- a/net/sched/act_tunnel_key.c
+++ b/net/sched/act_tunnel_key.c
@@ -201,7 +201,7 @@ err_out:
 	return ret;
 }
 
-static void tunnel_key_release(struct tc_action *a, int bind)
+static void tunnel_key_release(struct tc_action *a)
 {
 	struct tcf_tunnel_key *t = to_tunnel_key(a);
 	struct tcf_tunnel_key_params *params;
diff --git a/net/sched/act_vlan.c b/net/sched/act_vlan.c
index 97f717a13ad5..41f0878ad26e 100644
--- a/net/sched/act_vlan.c
+++ b/net/sched/act_vlan.c
@@ -219,7 +219,7 @@ static int tcf_vlan_init(struct net *net, struct nlattr *nla,
 	return ret;
 }
 
-static void tcf_vlan_cleanup(struct tc_action *a, int bind)
+static void tcf_vlan_cleanup(struct tc_action *a)
 {
 	struct tcf_vlan *v = to_vlan(a);
 	struct tcf_vlan_params *p;
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index ddcf04b4ab43..5b9b8a61e8c4 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -218,8 +218,12 @@ static void tcf_chain_flush(struct tcf_chain *chain)
 
 static void tcf_chain_destroy(struct tcf_chain *chain)
 {
+	struct tcf_block *block = chain->block;
+
 	list_del(&chain->list);
 	kfree(chain);
+	if (list_empty(&block->chain_list))
+		kfree(block);
 }
 
 static void tcf_chain_hold(struct tcf_chain *chain)
@@ -330,47 +334,32 @@ int tcf_block_get(struct tcf_block **p_block,
 }
 EXPORT_SYMBOL(tcf_block_get);
 
-static void tcf_block_put_final(struct work_struct *work)
-{
-	struct tcf_block *block = container_of(work, struct tcf_block, work);
-	struct tcf_chain *chain, *tmp;
-
-	rtnl_lock();
-
-	/* At this point, all the chains should have refcnt == 1. */
-	list_for_each_entry_safe(chain, tmp, &block->chain_list, list)
-		tcf_chain_put(chain);
-	rtnl_unlock();
-	kfree(block);
-}
-
 /* XXX: Standalone actions are not allowed to jump to any chain, and bound
  * actions should be all removed after flushing.
  */
 void tcf_block_put_ext(struct tcf_block *block, struct Qdisc *q,
 		       struct tcf_block_ext_info *ei)
 {
-	struct tcf_chain *chain;
+	struct tcf_chain *chain, *tmp;
 
-	/* Hold a refcnt for all chains, except 0, so that they don't disappear
+	/* Hold a refcnt for all chains, so that they don't disappear
 	 * while we are iterating.
 	 */
 	list_for_each_entry(chain, &block->chain_list, list)
-		if (chain->index)
-			tcf_chain_hold(chain);
+		tcf_chain_hold(chain);
 
 	list_for_each_entry(chain, &block->chain_list, list)
 		tcf_chain_flush(chain);
 
 	tcf_block_offload_unbind(block, q, ei);
 
-	INIT_WORK(&block->work, tcf_block_put_final);
-	/* Wait for existing RCU callbacks to cool down, make sure their works
-	 * have been queued before this. We can not flush pending works here
-	 * because we are holding the RTNL lock.
-	 */
-	rcu_barrier();
-	tcf_queue_work(&block->work);
+	/* At this point, all the chains should have refcnt >= 1. */
+	list_for_each_entry_safe(chain, tmp, &block->chain_list, list)
+		tcf_chain_put(chain);
+
+	/* Finally, put chain 0 and allow block to be freed. */
+	chain = list_first_entry(&block->chain_list, struct tcf_chain, list);
+	tcf_chain_put(chain);
 }
 EXPORT_SYMBOL(tcf_block_put_ext);
 
diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
index 543a3e875d05..6132a7317efa 100644
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -166,6 +166,7 @@ static int fl_classify(struct sk_buff *skb, const struct tcf_proto *tp,
 	 * so do it rather here.
 	 */
 	skb_key.basic.n_proto = skb->protocol;
+	skb_flow_dissect_tunnel_info(skb, &head->dissector, &skb_key);
 	skb_flow_dissect(skb, &head->dissector, &skb_key, 0);
 
 	fl_set_masked_key(&skb_mkey, &skb_key, &head->mask);
diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index b6c4f536876b..a904276b657d 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -797,7 +797,8 @@ static int tc_fill_qdisc(struct sk_buff *skb, struct Qdisc *q, u32 clid,
 		goto nla_put_failure;
 	if (q->ops->dump && q->ops->dump(q, skb) < 0)
 		goto nla_put_failure;
-	qlen = q->q.qlen;
+
+	qlen = qdisc_qlen_sum(q);
 
 	stab = rtnl_dereference(q->stab);
 	if (stab && qdisc_dump_stab(skb, stab) < 0)
@@ -954,6 +955,11 @@ skip:
 	} else {
 		const struct Qdisc_class_ops *cops = parent->ops->cl_ops;
 
+		/* Only support running class lockless if parent is lockless */
+		if (new && (new->flags & TCQ_F_NOLOCK) &&
+		    parent && !(parent->flags & TCQ_F_NOLOCK))
+			new->flags &= ~TCQ_F_NOLOCK;
+
 		err = -EOPNOTSUPP;
 		if (cops && cops->graft) {
 			unsigned long cl = cops->find(parent, classid);
@@ -1020,7 +1026,7 @@ static struct Qdisc *qdisc_create(struct net_device *dev,
 #endif
 
 	err = -ENOENT;
-	if (ops == NULL)
+	if (!ops)
 		goto err_out;
 
 	sch = qdisc_alloc(dev_queue, ops);
@@ -1060,54 +1066,60 @@ static struct Qdisc *qdisc_create(struct net_device *dev,
 		netdev_info(dev, "Caught tx_queue_len zero misconfig\n");
 	}
 
-	if (!ops->init || (err = ops->init(sch, tca[TCA_OPTIONS])) == 0) {
-		if (qdisc_is_percpu_stats(sch)) {
-			sch->cpu_bstats =
-				netdev_alloc_pcpu_stats(struct gnet_stats_basic_cpu);
-			if (!sch->cpu_bstats)
-				goto err_out4;
+	if (ops->init) {
+		err = ops->init(sch, tca[TCA_OPTIONS]);
+		if (err != 0)
+			goto err_out5;
+	}
 
-			sch->cpu_qstats = alloc_percpu(struct gnet_stats_queue);
-			if (!sch->cpu_qstats)
-				goto err_out4;
-		}
+	if (qdisc_is_percpu_stats(sch)) {
+		sch->cpu_bstats =
+			netdev_alloc_pcpu_stats(struct gnet_stats_basic_cpu);
+		if (!sch->cpu_bstats)
+			goto err_out4;
 
-		if (tca[TCA_STAB]) {
-			stab = qdisc_get_stab(tca[TCA_STAB]);
-			if (IS_ERR(stab)) {
-				err = PTR_ERR(stab);
-				goto err_out4;
-			}
-			rcu_assign_pointer(sch->stab, stab);
-		}
-		if (tca[TCA_RATE]) {
-			seqcount_t *running;
-
-			err = -EOPNOTSUPP;
-			if (sch->flags & TCQ_F_MQROOT)
-				goto err_out4;
+		sch->cpu_qstats = alloc_percpu(struct gnet_stats_queue);
+		if (!sch->cpu_qstats)
+			goto err_out4;
+	}
 
-			if ((sch->parent != TC_H_ROOT) &&
-			    !(sch->flags & TCQ_F_INGRESS) &&
-			    (!p || !(p->flags & TCQ_F_MQROOT)))
-				running = qdisc_root_sleeping_running(sch);
-			else
-				running = &sch->running;
-
-			err = gen_new_estimator(&sch->bstats,
-						sch->cpu_bstats,
-						&sch->rate_est,
-						NULL,
-						running,
-						tca[TCA_RATE]);
-			if (err)
-				goto err_out4;
+	if (tca[TCA_STAB]) {
+		stab = qdisc_get_stab(tca[TCA_STAB]);
+		if (IS_ERR(stab)) {
+			err = PTR_ERR(stab);
+			goto err_out4;
 		}
+		rcu_assign_pointer(sch->stab, stab);
+	}
+	if (tca[TCA_RATE]) {
+		seqcount_t *running;
 
-		qdisc_hash_add(sch, false);
+		err = -EOPNOTSUPP;
+		if (sch->flags & TCQ_F_MQROOT)
+			goto err_out4;
 
-		return sch;
+		if (sch->parent != TC_H_ROOT &&
+		    !(sch->flags & TCQ_F_INGRESS) &&
+		    (!p || !(p->flags & TCQ_F_MQROOT)))
+			running = qdisc_root_sleeping_running(sch);
+		else
+			running = &sch->running;
+
+		err = gen_new_estimator(&sch->bstats,
+					sch->cpu_bstats,
+					&sch->rate_est,
+					NULL,
+					running,
+					tca[TCA_RATE]);
+		if (err)
+			goto err_out4;
 	}
+
+	qdisc_hash_add(sch, false);
+
+	return sch;
+
+err_out5:
 	/* ops->init() failed, we call ->destroy() like qdisc_create_dflt() */
 	if (ops->destroy)
 		ops->destroy(sch);
@@ -1139,7 +1151,7 @@ static int qdisc_change(struct Qdisc *sch, struct nlattr **tca)
 	int err = 0;
 
 	if (tca[TCA_OPTIONS]) {
-		if (sch->ops->change == NULL)
+		if (!sch->ops->change)
 			return -EINVAL;
 		err = sch->ops->change(sch, tca[TCA_OPTIONS]);
 		if (err)
@@ -1344,7 +1356,8 @@ replay:
 					goto create_n_graft;
 				if (n->nlmsg_flags & NLM_F_EXCL)
 					return -EEXIST;
-				if (tca[TCA_KIND] && nla_strcmp(tca[TCA_KIND], q->ops->id))
+				if (tca[TCA_KIND] &&
+				    nla_strcmp(tca[TCA_KIND], q->ops->id))
 					return -EINVAL;
 				if (q == p ||
 				    (p && check_loop(q, p, 0)))
@@ -1389,7 +1402,7 @@ replay:
 	}
 
 	/* Change qdisc parameters */
-	if (q == NULL)
+	if (!q)
 		return -ENOENT;
 	if (n->nlmsg_flags & NLM_F_EXCL)
 		return -EEXIST;
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index cd1b200acae7..981c08fe810b 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -26,6 +26,7 @@
 #include <linux/list.h>
 #include <linux/slab.h>
 #include <linux/if_vlan.h>
+#include <linux/skb_array.h>
 #include <linux/if_macvlan.h>
 #include <net/sch_generic.h>
 #include <net/pkt_sched.h>
@@ -47,9 +48,70 @@ EXPORT_SYMBOL(default_qdisc_ops);
  * - updates to tree and tree walking are only done under the rtnl mutex.
  */
 
-static inline int dev_requeue_skb(struct sk_buff *skb, struct Qdisc *q)
+static inline struct sk_buff *__skb_dequeue_bad_txq(struct Qdisc *q)
+{
+	const struct netdev_queue *txq = q->dev_queue;
+	spinlock_t *lock = NULL;
+	struct sk_buff *skb;
+
+	if (q->flags & TCQ_F_NOLOCK) {
+		lock = qdisc_lock(q);
+		spin_lock(lock);
+	}
+
+	skb = skb_peek(&q->skb_bad_txq);
+	if (skb) {
+		/* check the reason of requeuing without tx lock first */
+		txq = skb_get_tx_queue(txq->dev, skb);
+		if (!netif_xmit_frozen_or_stopped(txq)) {
+			skb = __skb_dequeue(&q->skb_bad_txq);
+			if (qdisc_is_percpu_stats(q)) {
+				qdisc_qstats_cpu_backlog_dec(q, skb);
+				qdisc_qstats_cpu_qlen_dec(q);
+			} else {
+				qdisc_qstats_backlog_dec(q, skb);
+				q->q.qlen--;
+			}
+		} else {
+			skb = NULL;
+		}
+	}
+
+	if (lock)
+		spin_unlock(lock);
+
+	return skb;
+}
+
+static inline struct sk_buff *qdisc_dequeue_skb_bad_txq(struct Qdisc *q)
+{
+	struct sk_buff *skb = skb_peek(&q->skb_bad_txq);
+
+	if (unlikely(skb))
+		skb = __skb_dequeue_bad_txq(q);
+
+	return skb;
+}
+
+static inline void qdisc_enqueue_skb_bad_txq(struct Qdisc *q,
+					     struct sk_buff *skb)
 {
-	q->gso_skb = skb;
+	spinlock_t *lock = NULL;
+
+	if (q->flags & TCQ_F_NOLOCK) {
+		lock = qdisc_lock(q);
+		spin_lock(lock);
+	}
+
+	__skb_queue_tail(&q->skb_bad_txq, skb);
+
+	if (lock)
+		spin_unlock(lock);
+}
+
+static inline int __dev_requeue_skb(struct sk_buff *skb, struct Qdisc *q)
+{
+	__skb_queue_head(&q->gso_skb, skb);
 	q->qstats.requeues++;
 	qdisc_qstats_backlog_inc(q, skb);
 	q->q.qlen++;	/* it's still part of the queue */
@@ -58,6 +120,30 @@ static inline int dev_requeue_skb(struct sk_buff *skb, struct Qdisc *q)
 	return 0;
 }
 
+static inline int dev_requeue_skb_locked(struct sk_buff *skb, struct Qdisc *q)
+{
+	spinlock_t *lock = qdisc_lock(q);
+
+	spin_lock(lock);
+	__skb_queue_tail(&q->gso_skb, skb);
+	spin_unlock(lock);
+
+	qdisc_qstats_cpu_requeues_inc(q);
+	qdisc_qstats_cpu_backlog_inc(q, skb);
+	qdisc_qstats_cpu_qlen_inc(q);
+	__netif_schedule(q);
+
+	return 0;
+}
+
+static inline int dev_requeue_skb(struct sk_buff *skb, struct Qdisc *q)
+{
+	if (q->flags & TCQ_F_NOLOCK)
+		return dev_requeue_skb_locked(skb, q);
+	else
+		return __dev_requeue_skb(skb, q);
+}
+
 static void try_bulk_dequeue_skb(struct Qdisc *q,
 				 struct sk_buff *skb,
 				 const struct netdev_queue *txq,
@@ -95,9 +181,15 @@ static void try_bulk_dequeue_skb_slow(struct Qdisc *q,
 		if (!nskb)
 			break;
 		if (unlikely(skb_get_queue_mapping(nskb) != mapping)) {
-			q->skb_bad_txq = nskb;
-			qdisc_qstats_backlog_inc(q, nskb);
-			q->q.qlen++;
+			qdisc_enqueue_skb_bad_txq(q, nskb);
+
+			if (qdisc_is_percpu_stats(q)) {
+				qdisc_qstats_cpu_backlog_inc(q, nskb);
+				qdisc_qstats_cpu_qlen_inc(q);
+			} else {
+				qdisc_qstats_backlog_inc(q, nskb);
+				q->q.qlen++;
+			}
 			break;
 		}
 		skb->next = nskb;
@@ -113,40 +205,60 @@ static void try_bulk_dequeue_skb_slow(struct Qdisc *q,
 static struct sk_buff *dequeue_skb(struct Qdisc *q, bool *validate,
 				   int *packets)
 {
-	struct sk_buff *skb = q->gso_skb;
 	const struct netdev_queue *txq = q->dev_queue;
+	struct sk_buff *skb = NULL;
 
 	*packets = 1;
-	if (unlikely(skb)) {
+	if (unlikely(!skb_queue_empty(&q->gso_skb))) {
+		spinlock_t *lock = NULL;
+
+		if (q->flags & TCQ_F_NOLOCK) {
+			lock = qdisc_lock(q);
+			spin_lock(lock);
+		}
+
+		skb = skb_peek(&q->gso_skb);
+
+		/* skb may be null if another cpu pulls gso_skb off in between
+		 * empty check and lock.
+		 */
+		if (!skb) {
+			if (lock)
+				spin_unlock(lock);
+			goto validate;
+		}
+
 		/* skb in gso_skb were already validated */
 		*validate = false;
 		/* check the reason of requeuing without tx lock first */
 		txq = skb_get_tx_queue(txq->dev, skb);
 		if (!netif_xmit_frozen_or_stopped(txq)) {
-			q->gso_skb = NULL;
-			qdisc_qstats_backlog_dec(q, skb);
-			q->q.qlen--;
-		} else
+			skb = __skb_dequeue(&q->gso_skb);
+			if (qdisc_is_percpu_stats(q)) {
+				qdisc_qstats_cpu_backlog_dec(q, skb);
+				qdisc_qstats_cpu_qlen_dec(q);
+			} else {
+				qdisc_qstats_backlog_dec(q, skb);
+				q->q.qlen--;
+			}
+		} else {
 			skb = NULL;
-		goto trace;
-	}
-	*validate = true;
-	skb = q->skb_bad_txq;
-	if (unlikely(skb)) {
-		/* check the reason of requeuing without tx lock first */
-		txq = skb_get_tx_queue(txq->dev, skb);
-		if (!netif_xmit_frozen_or_stopped(txq)) {
-			q->skb_bad_txq = NULL;
-			qdisc_qstats_backlog_dec(q, skb);
-			q->q.qlen--;
-			goto bulk;
 		}
-		skb = NULL;
+		if (lock)
+			spin_unlock(lock);
 		goto trace;
 	}
-	if (!(q->flags & TCQ_F_ONETXQUEUE) ||
-	    !netif_xmit_frozen_or_stopped(txq))
-		skb = q->dequeue(q);
+validate:
+	*validate = true;
+
+	if ((q->flags & TCQ_F_ONETXQUEUE) &&
+	    netif_xmit_frozen_or_stopped(txq))
+		return skb;
+
+	skb = qdisc_dequeue_skb_bad_txq(q);
+	if (unlikely(skb))
+		goto bulk;
+	skb = q->dequeue(q);
 	if (skb) {
 bulk:
 		if (qdisc_may_bulk(q))
@@ -165,17 +277,18 @@ trace:
  * only one CPU can execute this function.
  *
  * Returns to the caller:
- *				0  - queue is empty or throttled.
- *				>0 - queue is not empty.
+ *				false  - hardware queue frozen backoff
+ *				true   - feel free to send more pkts
  */
-int sch_direct_xmit(struct sk_buff *skb, struct Qdisc *q,
-		    struct net_device *dev, struct netdev_queue *txq,
-		    spinlock_t *root_lock, bool validate)
+bool sch_direct_xmit(struct sk_buff *skb, struct Qdisc *q,
+		     struct net_device *dev, struct netdev_queue *txq,
+		     spinlock_t *root_lock, bool validate)
 {
 	int ret = NETDEV_TX_BUSY;
 
 	/* And release qdisc */
-	spin_unlock(root_lock);
+	if (root_lock)
+		spin_unlock(root_lock);
 
 	/* Note that we validate skb (GSO, checksum, ...) outside of locks */
 	if (validate)
@@ -188,27 +301,28 @@ int sch_direct_xmit(struct sk_buff *skb, struct Qdisc *q,
 
 		HARD_TX_UNLOCK(dev, txq);
 	} else {
-		spin_lock(root_lock);
-		return qdisc_qlen(q);
+		if (root_lock)
+			spin_lock(root_lock);
+		return true;
 	}
-	spin_lock(root_lock);
 
-	if (dev_xmit_complete(ret)) {
-		/* Driver sent out skb successfully or skb was consumed */
-		ret = qdisc_qlen(q);
-	} else {
+	if (root_lock)
+		spin_lock(root_lock);
+
+	if (!dev_xmit_complete(ret)) {
 		/* Driver returned NETDEV_TX_BUSY - requeue skb */
 		if (unlikely(ret != NETDEV_TX_BUSY))
 			net_warn_ratelimited("BUG %s code %d qlen %d\n",
 					     dev->name, ret, q->q.qlen);
 
-		ret = dev_requeue_skb(skb, q);
+		dev_requeue_skb(skb, q);
+		return false;
 	}
 
 	if (ret && netif_xmit_frozen_or_stopped(txq))
-		ret = 0;
+		return false;
 
-	return ret;
+	return true;
 }
 
 /*
@@ -230,20 +344,22 @@ int sch_direct_xmit(struct sk_buff *skb, struct Qdisc *q,
  *				>0 - queue is not empty.
  *
  */
-static inline int qdisc_restart(struct Qdisc *q, int *packets)
+static inline bool qdisc_restart(struct Qdisc *q, int *packets)
 {
+	spinlock_t *root_lock = NULL;
 	struct netdev_queue *txq;
 	struct net_device *dev;
-	spinlock_t *root_lock;
 	struct sk_buff *skb;
 	bool validate;
 
 	/* Dequeue packet */
 	skb = dequeue_skb(q, &validate, packets);
 	if (unlikely(!skb))
-		return 0;
+		return false;
+
+	if (!(q->flags & TCQ_F_NOLOCK))
+		root_lock = qdisc_lock(q);
 
-	root_lock = qdisc_lock(q);
 	dev = qdisc_dev(q);
 	txq = skb_get_tx_queue(dev, skb);
 
@@ -267,8 +383,6 @@ void __qdisc_run(struct Qdisc *q)
 			break;
 		}
 	}
-
-	qdisc_run_end(q);
 }
 
 unsigned long dev_trans_start(struct net_device *dev)
@@ -468,93 +582,93 @@ static const u8 prio2band[TC_PRIO_MAX + 1] = {
 
 /*
  * Private data for a pfifo_fast scheduler containing:
- * 	- queues for the three band
- * 	- bitmap indicating which of the bands contain skbs
+ *	- rings for priority bands
  */
 struct pfifo_fast_priv {
-	u32 bitmap;
-	struct qdisc_skb_head q[PFIFO_FAST_BANDS];
+	struct skb_array q[PFIFO_FAST_BANDS];
 };
 
-/*
- * Convert a bitmap to the first band number where an skb is queued, where:
- * 	bitmap=0 means there are no skbs on any band.
- * 	bitmap=1 means there is an skb on band 0.
- *	bitmap=7 means there are skbs on all 3 bands, etc.
- */
-static const int bitmap2band[] = {-1, 0, 1, 0, 2, 0, 1, 0};
-
-static inline struct qdisc_skb_head *band2list(struct pfifo_fast_priv *priv,
-					     int band)
+static inline struct skb_array *band2list(struct pfifo_fast_priv *priv,
+					  int band)
 {
-	return priv->q + band;
+	return &priv->q[band];
 }
 
 static int pfifo_fast_enqueue(struct sk_buff *skb, struct Qdisc *qdisc,
 			      struct sk_buff **to_free)
 {
-	if (qdisc->q.qlen < qdisc_dev(qdisc)->tx_queue_len) {
-		int band = prio2band[skb->priority & TC_PRIO_MAX];
-		struct pfifo_fast_priv *priv = qdisc_priv(qdisc);
-		struct qdisc_skb_head *list = band2list(priv, band);
-
-		priv->bitmap |= (1 << band);
-		qdisc->q.qlen++;
-		return __qdisc_enqueue_tail(skb, qdisc, list);
-	}
+	int band = prio2band[skb->priority & TC_PRIO_MAX];
+	struct pfifo_fast_priv *priv = qdisc_priv(qdisc);
+	struct skb_array *q = band2list(priv, band);
+	int err;
+
+	err = skb_array_produce(q, skb);
 
-	return qdisc_drop(skb, qdisc, to_free);
+	if (unlikely(err))
+		return qdisc_drop_cpu(skb, qdisc, to_free);
+
+	qdisc_qstats_cpu_qlen_inc(qdisc);
+	qdisc_qstats_cpu_backlog_inc(qdisc, skb);
+	return NET_XMIT_SUCCESS;
 }
 
 static struct sk_buff *pfifo_fast_dequeue(struct Qdisc *qdisc)
 {
 	struct pfifo_fast_priv *priv = qdisc_priv(qdisc);
-	int band = bitmap2band[priv->bitmap];
-
-	if (likely(band >= 0)) {
-		struct qdisc_skb_head *qh = band2list(priv, band);
-		struct sk_buff *skb = __qdisc_dequeue_head(qh);
+	struct sk_buff *skb = NULL;
+	int band;
 
-		if (likely(skb != NULL)) {
-			qdisc_qstats_backlog_dec(qdisc, skb);
-			qdisc_bstats_update(qdisc, skb);
-		}
+	for (band = 0; band < PFIFO_FAST_BANDS && !skb; band++) {
+		struct skb_array *q = band2list(priv, band);
 
-		qdisc->q.qlen--;
-		if (qh->qlen == 0)
-			priv->bitmap &= ~(1 << band);
+		if (__skb_array_empty(q))
+			continue;
 
-		return skb;
+		skb = skb_array_consume_bh(q);
+	}
+	if (likely(skb)) {
+		qdisc_qstats_cpu_backlog_dec(qdisc, skb);
+		qdisc_bstats_cpu_update(qdisc, skb);
+		qdisc_qstats_cpu_qlen_dec(qdisc);
 	}
 
-	return NULL;
+	return skb;
 }
 
 static struct sk_buff *pfifo_fast_peek(struct Qdisc *qdisc)
 {
 	struct pfifo_fast_priv *priv = qdisc_priv(qdisc);
-	int band = bitmap2band[priv->bitmap];
+	struct sk_buff *skb = NULL;
+	int band;
 
-	if (band >= 0) {
-		struct qdisc_skb_head *qh = band2list(priv, band);
+	for (band = 0; band < PFIFO_FAST_BANDS && !skb; band++) {
+		struct skb_array *q = band2list(priv, band);
 
-		return qh->head;
+		skb = __skb_array_peek(q);
 	}
 
-	return NULL;
+	return skb;
 }
 
 static void pfifo_fast_reset(struct Qdisc *qdisc)
 {
-	int prio;
+	int i, band;
 	struct pfifo_fast_priv *priv = qdisc_priv(qdisc);
 
-	for (prio = 0; prio < PFIFO_FAST_BANDS; prio++)
-		__qdisc_reset_queue(band2list(priv, prio));
+	for (band = 0; band < PFIFO_FAST_BANDS; band++) {
+		struct skb_array *q = band2list(priv, band);
+		struct sk_buff *skb;
 
-	priv->bitmap = 0;
-	qdisc->qstats.backlog = 0;
-	qdisc->q.qlen = 0;
+		while ((skb = skb_array_consume_bh(q)) != NULL)
+			kfree_skb(skb);
+	}
+
+	for_each_possible_cpu(i) {
+		struct gnet_stats_queue *q = per_cpu_ptr(qdisc->cpu_qstats, i);
+
+		q->backlog = 0;
+		q->qlen = 0;
+	}
 }
 
 static int pfifo_fast_dump(struct Qdisc *qdisc, struct sk_buff *skb)
@@ -572,17 +686,48 @@ nla_put_failure:
 
 static int pfifo_fast_init(struct Qdisc *qdisc, struct nlattr *opt)
 {
-	int prio;
+	unsigned int qlen = qdisc_dev(qdisc)->tx_queue_len;
 	struct pfifo_fast_priv *priv = qdisc_priv(qdisc);
+	int prio;
 
-	for (prio = 0; prio < PFIFO_FAST_BANDS; prio++)
-		qdisc_skb_head_init(band2list(priv, prio));
+	/* guard against zero length rings */
+	if (!qlen)
+		return -EINVAL;
+
+	for (prio = 0; prio < PFIFO_FAST_BANDS; prio++) {
+		struct skb_array *q = band2list(priv, prio);
+		int err;
+
+		err = skb_array_init(q, qlen, GFP_KERNEL);
+		if (err)
+			return -ENOMEM;
+	}
 
 	/* Can by-pass the queue discipline */
 	qdisc->flags |= TCQ_F_CAN_BYPASS;
 	return 0;
 }
 
+static void pfifo_fast_destroy(struct Qdisc *sch)
+{
+	struct pfifo_fast_priv *priv = qdisc_priv(sch);
+	int prio;
+
+	for (prio = 0; prio < PFIFO_FAST_BANDS; prio++) {
+		struct skb_array *q = band2list(priv, prio);
+
+		/* NULL ring is possible if destroy path is due to a failed
+		 * skb_array_init() in pfifo_fast_init() case.
+		 */
+		if (!&q->ring.queue)
+			continue;
+		/* Destroy ring but no need to kfree_skb because a call to
+		 * pfifo_fast_reset() has already done that work.
+		 */
+		ptr_ring_cleanup(&q->ring, NULL);
+	}
+}
+
 struct Qdisc_ops pfifo_fast_ops __read_mostly = {
 	.id		=	"pfifo_fast",
 	.priv_size	=	sizeof(struct pfifo_fast_priv),
@@ -590,9 +735,11 @@ struct Qdisc_ops pfifo_fast_ops __read_mostly = {
 	.dequeue	=	pfifo_fast_dequeue,
 	.peek		=	pfifo_fast_peek,
 	.init		=	pfifo_fast_init,
+	.destroy	=	pfifo_fast_destroy,
 	.reset		=	pfifo_fast_reset,
 	.dump		=	pfifo_fast_dump,
 	.owner		=	THIS_MODULE,
+	.static_flags	=	TCQ_F_NOLOCK | TCQ_F_CPUSTATS,
 };
 EXPORT_SYMBOL(pfifo_fast_ops);
 
@@ -630,9 +777,24 @@ struct Qdisc *qdisc_alloc(struct netdev_queue *dev_queue,
 		sch = (struct Qdisc *) QDISC_ALIGN((unsigned long) p);
 		sch->padded = (char *) sch - (char *) p;
 	}
+	__skb_queue_head_init(&sch->gso_skb);
+	__skb_queue_head_init(&sch->skb_bad_txq);
 	qdisc_skb_head_init(&sch->q);
 	spin_lock_init(&sch->q.lock);
 
+	if (ops->static_flags & TCQ_F_CPUSTATS) {
+		sch->cpu_bstats =
+			netdev_alloc_pcpu_stats(struct gnet_stats_basic_cpu);
+		if (!sch->cpu_bstats)
+			goto errout1;
+
+		sch->cpu_qstats = alloc_percpu(struct gnet_stats_queue);
+		if (!sch->cpu_qstats) {
+			free_percpu(sch->cpu_bstats);
+			goto errout1;
+		}
+	}
+
 	spin_lock_init(&sch->busylock);
 	lockdep_set_class(&sch->busylock,
 			  dev->qdisc_tx_busylock ?: &qdisc_tx_busylock);
@@ -642,6 +804,7 @@ struct Qdisc *qdisc_alloc(struct netdev_queue *dev_queue,
 			  dev->qdisc_running_key ?: &qdisc_running_key);
 
 	sch->ops = ops;
+	sch->flags = ops->static_flags;
 	sch->enqueue = ops->enqueue;
 	sch->dequeue = ops->dequeue;
 	sch->dev_queue = dev_queue;
@@ -649,6 +812,8 @@ struct Qdisc *qdisc_alloc(struct netdev_queue *dev_queue,
 	refcount_set(&sch->refcnt, 1);
 
 	return sch;
+errout1:
+	kfree(p);
 errout:
 	return ERR_PTR(err);
 }
@@ -682,17 +847,21 @@ EXPORT_SYMBOL(qdisc_create_dflt);
 void qdisc_reset(struct Qdisc *qdisc)
 {
 	const struct Qdisc_ops *ops = qdisc->ops;
+	struct sk_buff *skb, *tmp;
 
 	if (ops->reset)
 		ops->reset(qdisc);
 
-	kfree_skb(qdisc->skb_bad_txq);
-	qdisc->skb_bad_txq = NULL;
+	skb_queue_walk_safe(&qdisc->gso_skb, skb, tmp) {
+		__skb_unlink(skb, &qdisc->gso_skb);
+		kfree_skb_list(skb);
+	}
 
-	if (qdisc->gso_skb) {
-		kfree_skb_list(qdisc->gso_skb);
-		qdisc->gso_skb = NULL;
+	skb_queue_walk_safe(&qdisc->skb_bad_txq, skb, tmp) {
+		__skb_unlink(skb, &qdisc->skb_bad_txq);
+		kfree_skb_list(skb);
 	}
+
 	qdisc->q.qlen = 0;
 	qdisc->qstats.backlog = 0;
 }
@@ -711,6 +880,7 @@ static void qdisc_free(struct Qdisc *qdisc)
 void qdisc_destroy(struct Qdisc *qdisc)
 {
 	const struct Qdisc_ops  *ops = qdisc->ops;
+	struct sk_buff *skb, *tmp;
 
 	if (qdisc->flags & TCQ_F_BUILTIN ||
 	    !refcount_dec_and_test(&qdisc->refcnt))
@@ -730,8 +900,16 @@ void qdisc_destroy(struct Qdisc *qdisc)
 	module_put(ops->owner);
 	dev_put(qdisc_dev(qdisc));
 
-	kfree_skb_list(qdisc->gso_skb);
-	kfree_skb(qdisc->skb_bad_txq);
+	skb_queue_walk_safe(&qdisc->gso_skb, skb, tmp) {
+		__skb_unlink(skb, &qdisc->gso_skb);
+		kfree_skb_list(skb);
+	}
+
+	skb_queue_walk_safe(&qdisc->skb_bad_txq, skb, tmp) {
+		__skb_unlink(skb, &qdisc->skb_bad_txq);
+		kfree_skb_list(skb);
+	}
+
 	qdisc_free(qdisc);
 }
 EXPORT_SYMBOL(qdisc_destroy);
@@ -746,10 +924,6 @@ struct Qdisc *dev_graft_qdisc(struct netdev_queue *dev_queue,
 	root_lock = qdisc_lock(oqdisc);
 	spin_lock_bh(root_lock);
 
-	/* Prune old scheduler */
-	if (oqdisc && refcount_read(&oqdisc->refcnt) <= 1)
-		qdisc_reset(oqdisc);
-
 	/* ... and graft new one */
 	if (qdisc == NULL)
 		qdisc = &noop_qdisc;
@@ -885,14 +1059,18 @@ static bool some_qdisc_is_busy(struct net_device *dev)
 
 		dev_queue = netdev_get_tx_queue(dev, i);
 		q = dev_queue->qdisc_sleeping;
-		root_lock = qdisc_lock(q);
 
-		spin_lock_bh(root_lock);
+		if (q->flags & TCQ_F_NOLOCK) {
+			val = test_bit(__QDISC_STATE_SCHED, &q->state);
+		} else {
+			root_lock = qdisc_lock(q);
+			spin_lock_bh(root_lock);
 
-		val = (qdisc_is_running(q) ||
-		       test_bit(__QDISC_STATE_SCHED, &q->state));
+			val = (qdisc_is_running(q) ||
+			       test_bit(__QDISC_STATE_SCHED, &q->state));
 
-		spin_unlock_bh(root_lock);
+			spin_unlock_bh(root_lock);
+		}
 
 		if (val)
 			return true;
@@ -900,6 +1078,16 @@ static bool some_qdisc_is_busy(struct net_device *dev)
 	return false;
 }
 
+static void dev_qdisc_reset(struct net_device *dev,
+			    struct netdev_queue *dev_queue,
+			    void *none)
+{
+	struct Qdisc *qdisc = dev_queue->qdisc_sleeping;
+
+	if (qdisc)
+		qdisc_reset(qdisc);
+}
+
 /**
  * 	dev_deactivate_many - deactivate transmissions on several devices
  * 	@head: list of devices to deactivate
@@ -910,7 +1098,6 @@ static bool some_qdisc_is_busy(struct net_device *dev)
 void dev_deactivate_many(struct list_head *head)
 {
 	struct net_device *dev;
-	bool sync_needed = false;
 
 	list_for_each_entry(dev, head, close_list) {
 		netdev_for_each_tx_queue(dev, dev_deactivate_queue,
@@ -920,20 +1107,25 @@ void dev_deactivate_many(struct list_head *head)
 					     &noop_qdisc);
 
 		dev_watchdog_down(dev);
-		sync_needed |= !dev->dismantle;
 	}
 
 	/* Wait for outstanding qdisc-less dev_queue_xmit calls.
 	 * This is avoided if all devices are in dismantle phase :
 	 * Caller will call synchronize_net() for us
 	 */
-	if (sync_needed)
-		synchronize_net();
+	synchronize_net();
 
 	/* Wait for outstanding qdisc_run calls. */
-	list_for_each_entry(dev, head, close_list)
+	list_for_each_entry(dev, head, close_list) {
 		while (some_qdisc_is_busy(dev))
 			yield();
+		/* The new qdisc is assigned at this point so we can safely
+		 * unwind stale skb lists and qdisc statistics
+		 */
+		netdev_for_each_tx_queue(dev, dev_qdisc_reset, NULL);
+		if (dev_ingress_queue(dev))
+			dev_qdisc_reset(dev, dev_ingress_queue(dev), NULL);
+	}
 }
 
 void dev_deactivate(struct net_device *dev)
@@ -954,6 +1146,8 @@ static void dev_init_scheduler_queue(struct net_device *dev,
 
 	rcu_assign_pointer(dev_queue->qdisc, qdisc);
 	dev_queue->qdisc_sleeping = qdisc;
+	__skb_queue_head_init(&qdisc->gso_skb);
+	__skb_queue_head_init(&qdisc->skb_bad_txq);
 }
 
 void dev_init_scheduler(struct net_device *dev)
diff --git a/net/sched/sch_mq.c b/net/sched/sch_mq.c
index 213b586a06a0..8cbb5c829d59 100644
--- a/net/sched/sch_mq.c
+++ b/net/sched/sch_mq.c
@@ -17,6 +17,7 @@
 #include <linux/skbuff.h>
 #include <net/netlink.h>
 #include <net/pkt_sched.h>
+#include <net/sch_generic.h>
 
 struct mq_sched {
 	struct Qdisc		**qdiscs;
@@ -97,23 +98,42 @@ static int mq_dump(struct Qdisc *sch, struct sk_buff *skb)
 	struct net_device *dev = qdisc_dev(sch);
 	struct Qdisc *qdisc;
 	unsigned int ntx;
+	__u32 qlen = 0;
 
 	sch->q.qlen = 0;
 	memset(&sch->bstats, 0, sizeof(sch->bstats));
 	memset(&sch->qstats, 0, sizeof(sch->qstats));
 
+	/* MQ supports lockless qdiscs. However, statistics accounting needs
+	 * to account for all, none, or a mix of locked and unlocked child
+	 * qdiscs. Percpu stats are added to counters in-band and locking
+	 * qdisc totals are added at end.
+	 */
 	for (ntx = 0; ntx < dev->num_tx_queues; ntx++) {
 		qdisc = netdev_get_tx_queue(dev, ntx)->qdisc_sleeping;
 		spin_lock_bh(qdisc_lock(qdisc));
-		sch->q.qlen		+= qdisc->q.qlen;
-		sch->bstats.bytes	+= qdisc->bstats.bytes;
-		sch->bstats.packets	+= qdisc->bstats.packets;
-		sch->qstats.backlog	+= qdisc->qstats.backlog;
-		sch->qstats.drops	+= qdisc->qstats.drops;
-		sch->qstats.requeues	+= qdisc->qstats.requeues;
-		sch->qstats.overlimits	+= qdisc->qstats.overlimits;
+
+		if (qdisc_is_percpu_stats(qdisc)) {
+			qlen = qdisc_qlen_sum(qdisc);
+			__gnet_stats_copy_basic(NULL, &sch->bstats,
+						qdisc->cpu_bstats,
+						&qdisc->bstats);
+			__gnet_stats_copy_queue(&sch->qstats,
+						qdisc->cpu_qstats,
+						&qdisc->qstats, qlen);
+		} else {
+			sch->q.qlen		+= qdisc->q.qlen;
+			sch->bstats.bytes	+= qdisc->bstats.bytes;
+			sch->bstats.packets	+= qdisc->bstats.packets;
+			sch->qstats.backlog	+= qdisc->qstats.backlog;
+			sch->qstats.drops	+= qdisc->qstats.drops;
+			sch->qstats.requeues	+= qdisc->qstats.requeues;
+			sch->qstats.overlimits	+= qdisc->qstats.overlimits;
+		}
+
 		spin_unlock_bh(qdisc_lock(qdisc));
 	}
+
 	return 0;
 }
 
diff --git a/net/sched/sch_mqprio.c b/net/sched/sch_mqprio.c
index b85885a9d8a1..8622745f3cd9 100644
--- a/net/sched/sch_mqprio.c
+++ b/net/sched/sch_mqprio.c
@@ -388,22 +388,40 @@ static int mqprio_dump(struct Qdisc *sch, struct sk_buff *skb)
 	struct nlattr *nla = (struct nlattr *)skb_tail_pointer(skb);
 	struct tc_mqprio_qopt opt = { 0 };
 	struct Qdisc *qdisc;
-	unsigned int i;
+	unsigned int ntx, tc;
 
 	sch->q.qlen = 0;
 	memset(&sch->bstats, 0, sizeof(sch->bstats));
 	memset(&sch->qstats, 0, sizeof(sch->qstats));
 
-	for (i = 0; i < dev->num_tx_queues; i++) {
-		qdisc = rtnl_dereference(netdev_get_tx_queue(dev, i)->qdisc);
+	/* MQ supports lockless qdiscs. However, statistics accounting needs
+	 * to account for all, none, or a mix of locked and unlocked child
+	 * qdiscs. Percpu stats are added to counters in-band and locking
+	 * qdisc totals are added at end.
+	 */
+	for (ntx = 0; ntx < dev->num_tx_queues; ntx++) {
+		qdisc = netdev_get_tx_queue(dev, ntx)->qdisc_sleeping;
 		spin_lock_bh(qdisc_lock(qdisc));
-		sch->q.qlen		+= qdisc->q.qlen;
-		sch->bstats.bytes	+= qdisc->bstats.bytes;
-		sch->bstats.packets	+= qdisc->bstats.packets;
-		sch->qstats.backlog	+= qdisc->qstats.backlog;
-		sch->qstats.drops	+= qdisc->qstats.drops;
-		sch->qstats.requeues	+= qdisc->qstats.requeues;
-		sch->qstats.overlimits	+= qdisc->qstats.overlimits;
+
+		if (qdisc_is_percpu_stats(qdisc)) {
+			__u32 qlen = qdisc_qlen_sum(qdisc);
+
+			__gnet_stats_copy_basic(NULL, &sch->bstats,
+						qdisc->cpu_bstats,
+						&qdisc->bstats);
+			__gnet_stats_copy_queue(&sch->qstats,
+						qdisc->cpu_qstats,
+						&qdisc->qstats, qlen);
+		} else {
+			sch->q.qlen		+= qdisc->q.qlen;
+			sch->bstats.bytes	+= qdisc->bstats.bytes;
+			sch->bstats.packets	+= qdisc->bstats.packets;
+			sch->qstats.backlog	+= qdisc->qstats.backlog;
+			sch->qstats.drops	+= qdisc->qstats.drops;
+			sch->qstats.requeues	+= qdisc->qstats.requeues;
+			sch->qstats.overlimits	+= qdisc->qstats.overlimits;
+		}
+
 		spin_unlock_bh(qdisc_lock(qdisc));
 	}
 
@@ -411,9 +429,9 @@ static int mqprio_dump(struct Qdisc *sch, struct sk_buff *skb)
 	memcpy(opt.prio_tc_map, dev->prio_tc_map, sizeof(opt.prio_tc_map));
 	opt.hw = priv->hw_offload;
 
-	for (i = 0; i < netdev_get_num_tc(dev); i++) {
-		opt.count[i] = dev->tc_to_txq[i].count;
-		opt.offset[i] = dev->tc_to_txq[i].offset;
+	for (tc = 0; tc < netdev_get_num_tc(dev); tc++) {
+		opt.count[tc] = dev->tc_to_txq[tc].count;
+		opt.offset[tc] = dev->tc_to_txq[tc].offset;
 	}
 
 	if (nla_put(skb, TCA_OPTIONS, NLA_ALIGN(sizeof(opt)), &opt))
@@ -495,7 +513,6 @@ static int mqprio_dump_class_stats(struct Qdisc *sch, unsigned long cl,
 	if (cl >= TC_H_MIN_PRIORITY) {
 		int i;
 		__u32 qlen = 0;
-		struct Qdisc *qdisc;
 		struct gnet_stats_queue qstats = {0};
 		struct gnet_stats_basic_packed bstats = {0};
 		struct net_device *dev = qdisc_dev(sch);
@@ -511,18 +528,26 @@ static int mqprio_dump_class_stats(struct Qdisc *sch, unsigned long cl,
 
 		for (i = tc.offset; i < tc.offset + tc.count; i++) {
 			struct netdev_queue *q = netdev_get_tx_queue(dev, i);
+			struct Qdisc *qdisc = rtnl_dereference(q->qdisc);
+			struct gnet_stats_basic_cpu __percpu *cpu_bstats = NULL;
+			struct gnet_stats_queue __percpu *cpu_qstats = NULL;
 
-			qdisc = rtnl_dereference(q->qdisc);
 			spin_lock_bh(qdisc_lock(qdisc));
-			qlen		  += qdisc->q.qlen;
-			bstats.bytes      += qdisc->bstats.bytes;
-			bstats.packets    += qdisc->bstats.packets;
-			qstats.backlog    += qdisc->qstats.backlog;
-			qstats.drops      += qdisc->qstats.drops;
-			qstats.requeues   += qdisc->qstats.requeues;
-			qstats.overlimits += qdisc->qstats.overlimits;
+			if (qdisc_is_percpu_stats(qdisc)) {
+				cpu_bstats = qdisc->cpu_bstats;
+				cpu_qstats = qdisc->cpu_qstats;
+			}
+
+			qlen = qdisc_qlen_sum(qdisc);
+			__gnet_stats_copy_basic(NULL, &sch->bstats,
+						cpu_bstats, &qdisc->bstats);
+			__gnet_stats_copy_queue(&sch->qstats,
+						cpu_qstats,
+						&qdisc->qstats,
+						qlen);
 			spin_unlock_bh(qdisc_lock(qdisc));
 		}
+
 		/* Reclaim root sleeping lock before completing stats */
 		if (d->lock)
 			spin_lock_bh(d->lock);
diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index 6451c5013e06..daf8075f5a4c 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -520,7 +520,7 @@ decline_rdma:
 	smc->use_fallback = true;
 	if (reason_code && (reason_code != SMC_CLC_DECL_REPLY)) {
 		rc = smc_clc_send_decline(smc, reason_code);
-		if (rc < sizeof(struct smc_clc_msg_decline))
+		if (rc < 0)
 			goto out_err;
 	}
 	goto out_connected;
@@ -751,14 +751,16 @@ static void smc_listen_work(struct work_struct *work)
 {
 	struct smc_sock *new_smc = container_of(work, struct smc_sock,
 						smc_listen_work);
+	struct smc_clc_msg_proposal_prefix *pclc_prfx;
 	struct socket *newclcsock = new_smc->clcsock;
 	struct smc_sock *lsmc = new_smc->listen_smc;
 	struct smc_clc_msg_accept_confirm cclc;
 	int local_contact = SMC_REUSE_CONTACT;
 	struct sock *newsmcsk = &new_smc->sk;
-	struct smc_clc_msg_proposal pclc;
+	struct smc_clc_msg_proposal *pclc;
 	struct smc_ib_device *smcibdev;
 	struct sockaddr_in peeraddr;
+	u8 buf[SMC_CLC_MAX_LEN];
 	struct smc_link *link;
 	int reason_code = 0;
 	int rc = 0, len;
@@ -775,7 +777,7 @@ static void smc_listen_work(struct work_struct *work)
 	/* do inband token exchange -
 	 *wait for and receive SMC Proposal CLC message
 	 */
-	reason_code = smc_clc_wait_msg(new_smc, &pclc, sizeof(pclc),
+	reason_code = smc_clc_wait_msg(new_smc, &buf, sizeof(buf),
 				       SMC_CLC_PROPOSAL);
 	if (reason_code < 0)
 		goto out_err;
@@ -804,8 +806,11 @@ static void smc_listen_work(struct work_struct *work)
 		reason_code = SMC_CLC_DECL_CNFERR; /* configuration error */
 		goto decline_rdma;
 	}
-	if ((pclc.outgoing_subnet != subnet) ||
-	    (pclc.prefix_len != prefix_len)) {
+
+	pclc = (struct smc_clc_msg_proposal *)&buf;
+	pclc_prfx = smc_clc_proposal_get_prefix(pclc);
+	if (pclc_prfx->outgoing_subnet != subnet ||
+	    pclc_prfx->prefix_len != prefix_len) {
 		reason_code = SMC_CLC_DECL_CNFERR; /* configuration error */
 		goto decline_rdma;
 	}
@@ -816,7 +821,7 @@ static void smc_listen_work(struct work_struct *work)
 	/* allocate connection / link group */
 	mutex_lock(&smc_create_lgr_pending);
 	local_contact = smc_conn_create(new_smc, peeraddr.sin_addr.s_addr,
-					smcibdev, ibport, &pclc.lcl, 0);
+					smcibdev, ibport, &pclc->lcl, 0);
 	if (local_contact < 0) {
 		rc = local_contact;
 		if (rc == -ENOMEM)
@@ -879,11 +884,9 @@ static void smc_listen_work(struct work_struct *work)
 		}
 		/* QP confirmation over RoCE fabric */
 		reason_code = smc_serv_conf_first_link(new_smc);
-		if (reason_code < 0) {
+		if (reason_code < 0)
 			/* peer is not aware of a problem */
-			rc = reason_code;
 			goto out_err_unlock;
-		}
 		if (reason_code > 0)
 			goto decline_rdma_unlock;
 	}
@@ -916,8 +919,7 @@ decline_rdma:
 	smc_conn_free(&new_smc->conn);
 	new_smc->use_fallback = true;
 	if (reason_code && (reason_code != SMC_CLC_DECL_REPLY)) {
-		rc = smc_clc_send_decline(new_smc, reason_code);
-		if (rc < sizeof(struct smc_clc_msg_decline))
+		if (smc_clc_send_decline(new_smc, reason_code) < 0)
 			goto out_err;
 	}
 	goto out_connected;
diff --git a/net/smc/smc_cdc.c b/net/smc/smc_cdc.c
index 87f7bede6eab..d4155ff6acde 100644
--- a/net/smc/smc_cdc.c
+++ b/net/smc/smc_cdc.c
@@ -213,6 +213,9 @@ static void smc_cdc_msg_recv_action(struct smc_sock *smc,
 		/* guarantee 0 <= bytes_to_rcv <= rmbe_size */
 		smp_mb__after_atomic();
 		smc->sk.sk_data_ready(&smc->sk);
+	} else if ((conn->local_rx_ctrl.prod_flags.write_blocked) ||
+		   (conn->local_rx_ctrl.prod_flags.cons_curs_upd_req)) {
+		smc->sk.sk_data_ready(&smc->sk);
 	}
 
 	if (conn->local_rx_ctrl.conn_state_flags.peer_conn_abort) {
@@ -234,15 +237,6 @@ static void smc_cdc_msg_recv_action(struct smc_sock *smc,
 		/* trigger socket release if connection closed */
 		smc_close_wake_tx_prepared(smc);
 	}
-
-	/* socket connected but not accepted */
-	if (!smc->sk.sk_socket)
-		return;
-
-	/* data available */
-	if ((conn->local_rx_ctrl.prod_flags.write_blocked) ||
-	    (conn->local_rx_ctrl.prod_flags.cons_curs_upd_req))
-		smc_tx_consumer_update(conn);
 }
 
 /* called under tasklet context */
diff --git a/net/smc/smc_clc.c b/net/smc/smc_clc.c
index 1800e16b2a02..abf7ceb6690b 100644
--- a/net/smc/smc_clc.c
+++ b/net/smc/smc_clc.c
@@ -22,6 +22,54 @@
 #include "smc_clc.h"
 #include "smc_ib.h"
 
+/* check if received message has a correct header length and contains valid
+ * heading and trailing eyecatchers
+ */
+static bool smc_clc_msg_hdr_valid(struct smc_clc_msg_hdr *clcm)
+{
+	struct smc_clc_msg_proposal_prefix *pclc_prfx;
+	struct smc_clc_msg_accept_confirm *clc;
+	struct smc_clc_msg_proposal *pclc;
+	struct smc_clc_msg_decline *dclc;
+	struct smc_clc_msg_trail *trl;
+
+	if (memcmp(clcm->eyecatcher, SMC_EYECATCHER, sizeof(SMC_EYECATCHER)))
+		return false;
+	switch (clcm->type) {
+	case SMC_CLC_PROPOSAL:
+		pclc = (struct smc_clc_msg_proposal *)clcm;
+		pclc_prfx = smc_clc_proposal_get_prefix(pclc);
+		if (ntohs(pclc->hdr.length) !=
+			sizeof(*pclc) + ntohs(pclc->iparea_offset) +
+			sizeof(*pclc_prfx) +
+			pclc_prfx->ipv6_prefixes_cnt *
+				sizeof(struct smc_clc_ipv6_prefix) +
+			sizeof(*trl))
+			return false;
+		trl = (struct smc_clc_msg_trail *)
+			((u8 *)pclc + ntohs(pclc->hdr.length) - sizeof(*trl));
+		break;
+	case SMC_CLC_ACCEPT:
+	case SMC_CLC_CONFIRM:
+		clc = (struct smc_clc_msg_accept_confirm *)clcm;
+		if (ntohs(clc->hdr.length) != sizeof(*clc))
+			return false;
+		trl = &clc->trl;
+		break;
+	case SMC_CLC_DECLINE:
+		dclc = (struct smc_clc_msg_decline *)clcm;
+		if (ntohs(dclc->hdr.length) != sizeof(*dclc))
+			return false;
+		trl = &dclc->trl;
+		break;
+	default:
+		return false;
+	}
+	if (memcmp(trl->eyecatcher, SMC_EYECATCHER, sizeof(SMC_EYECATCHER)))
+		return false;
+	return true;
+}
+
 /* Wait for data on the tcp-socket, analyze received data
  * Returns:
  * 0 if success and it was not a decline that we received.
@@ -72,9 +120,7 @@ int smc_clc_wait_msg(struct smc_sock *smc, void *buf, int buflen,
 	}
 	datlen = ntohs(clcm->length);
 	if ((len < sizeof(struct smc_clc_msg_hdr)) ||
-	    (datlen < sizeof(struct smc_clc_msg_decline)) ||
-	    (datlen > sizeof(struct smc_clc_msg_accept_confirm)) ||
-	    memcmp(clcm->eyecatcher, SMC_EYECATCHER, sizeof(SMC_EYECATCHER)) ||
+	    (datlen > buflen) ||
 	    ((clcm->type != SMC_CLC_DECLINE) &&
 	     (clcm->type != expected_type))) {
 		smc->sk.sk_err = EPROTO;
@@ -89,7 +135,7 @@ int smc_clc_wait_msg(struct smc_sock *smc, void *buf, int buflen,
 	krflags = MSG_WAITALL;
 	smc->clcsock->sk->sk_rcvtimeo = CLC_WAIT_TIME;
 	len = kernel_recvmsg(smc->clcsock, &msg, &vec, 1, datlen, krflags);
-	if (len < datlen) {
+	if (len < datlen || !smc_clc_msg_hdr_valid(clcm)) {
 		smc->sk.sk_err = EPROTO;
 		reason_code = -EPROTO;
 		goto out;
@@ -133,7 +179,7 @@ int smc_clc_send_decline(struct smc_sock *smc, u32 peer_diag_info)
 		smc->sk.sk_err = EPROTO;
 	if (len < 0)
 		smc->sk.sk_err = -len;
-	return len;
+	return sock_error(&smc->sk);
 }
 
 /* send CLC PROPOSAL message across internal TCP socket */
@@ -141,33 +187,43 @@ int smc_clc_send_proposal(struct smc_sock *smc,
 			  struct smc_ib_device *smcibdev,
 			  u8 ibport)
 {
+	struct smc_clc_msg_proposal_prefix pclc_prfx;
 	struct smc_clc_msg_proposal pclc;
+	struct smc_clc_msg_trail trl;
 	int reason_code = 0;
+	struct kvec vec[3];
 	struct msghdr msg;
-	struct kvec vec;
-	int len, rc;
+	int len, plen, rc;
 
 	/* send SMC Proposal CLC message */
+	plen = sizeof(pclc) + sizeof(pclc_prfx) + sizeof(trl);
 	memset(&pclc, 0, sizeof(pclc));
 	memcpy(pclc.hdr.eyecatcher, SMC_EYECATCHER, sizeof(SMC_EYECATCHER));
 	pclc.hdr.type = SMC_CLC_PROPOSAL;
-	pclc.hdr.length = htons(sizeof(pclc));
+	pclc.hdr.length = htons(plen);
 	pclc.hdr.version = SMC_CLC_V1;		/* SMC version */
 	memcpy(pclc.lcl.id_for_peer, local_systemid, sizeof(local_systemid));
 	memcpy(&pclc.lcl.gid, &smcibdev->gid[ibport - 1], SMC_GID_SIZE);
 	memcpy(&pclc.lcl.mac, &smcibdev->mac[ibport - 1], ETH_ALEN);
+	pclc.iparea_offset = htons(0);
 
+	memset(&pclc_prfx, 0, sizeof(pclc_prfx));
 	/* determine subnet and mask from internal TCP socket */
-	rc = smc_netinfo_by_tcpsk(smc->clcsock, &pclc.outgoing_subnet,
-				  &pclc.prefix_len);
+	rc = smc_netinfo_by_tcpsk(smc->clcsock, &pclc_prfx.outgoing_subnet,
+				  &pclc_prfx.prefix_len);
 	if (rc)
 		return SMC_CLC_DECL_CNFERR; /* configuration error */
-	memcpy(pclc.trl.eyecatcher, SMC_EYECATCHER, sizeof(SMC_EYECATCHER));
+	pclc_prfx.ipv6_prefixes_cnt = 0;
+	memcpy(trl.eyecatcher, SMC_EYECATCHER, sizeof(SMC_EYECATCHER));
 	memset(&msg, 0, sizeof(msg));
-	vec.iov_base = &pclc;
-	vec.iov_len = sizeof(pclc);
+	vec[0].iov_base = &pclc;
+	vec[0].iov_len = sizeof(pclc);
+	vec[1].iov_base = &pclc_prfx;
+	vec[1].iov_len = sizeof(pclc_prfx);
+	vec[2].iov_base = &trl;
+	vec[2].iov_len = sizeof(trl);
 	/* due to the few bytes needed for clc-handshake this cannot block */
-	len = kernel_sendmsg(smc->clcsock, &msg, &vec, 1, sizeof(pclc));
+	len = kernel_sendmsg(smc->clcsock, &msg, vec, 3, plen);
 	if (len < sizeof(pclc)) {
 		if (len >= 0) {
 			reason_code = -ENETUNREACH;
diff --git a/net/smc/smc_clc.h b/net/smc/smc_clc.h
index 12a9af1539a2..c145a0f36a68 100644
--- a/net/smc/smc_clc.h
+++ b/net/smc/smc_clc.h
@@ -44,7 +44,7 @@ struct smc_clc_msg_hdr {	/* header1 of clc messages */
 #if defined(__BIG_ENDIAN_BITFIELD)
 	u8 version : 4,
 	   flag    : 1,
-	   rsvd	   : 3;
+	   rsvd    : 3;
 #elif defined(__LITTLE_ENDIAN_BITFIELD)
 	u8 rsvd    : 3,
 	   flag    : 1,
@@ -62,17 +62,31 @@ struct smc_clc_msg_local {	/* header2 of clc messages */
 	u8 mac[6];		/* mac of ib_device port */
 };
 
-struct smc_clc_msg_proposal {	/* clc proposal message */
-	struct smc_clc_msg_hdr hdr;
-	struct smc_clc_msg_local lcl;
-	__be16 iparea_offset;	/* offset to IP address information area */
+struct smc_clc_ipv6_prefix {
+	u8 prefix[4];
+	u8 prefix_len;
+} __packed;
+
+struct smc_clc_msg_proposal_prefix {	/* prefix part of clc proposal message*/
 	__be32 outgoing_subnet;	/* subnet mask */
 	u8 prefix_len;		/* number of significant bits in mask */
 	u8 reserved[2];
 	u8 ipv6_prefixes_cnt;	/* number of IPv6 prefixes in prefix array */
-	struct smc_clc_msg_trail trl; /* eye catcher "SMCR" EBCDIC */
 } __aligned(4);
 
+struct smc_clc_msg_proposal {	/* clc proposal message sent by Linux */
+	struct smc_clc_msg_hdr hdr;
+	struct smc_clc_msg_local lcl;
+	__be16 iparea_offset;	/* offset to IP address information area */
+} __aligned(4);
+
+#define SMC_CLC_PROPOSAL_MAX_OFFSET	0x28
+#define SMC_CLC_PROPOSAL_MAX_PREFIX	(8 * sizeof(struct smc_clc_ipv6_prefix))
+#define SMC_CLC_MAX_LEN		(sizeof(struct smc_clc_msg_proposal) + \
+				 SMC_CLC_PROPOSAL_MAX_OFFSET + \
+				 SMC_CLC_PROPOSAL_MAX_PREFIX + \
+				 sizeof(struct smc_clc_msg_trail))
+
 struct smc_clc_msg_accept_confirm {	/* clc accept / confirm message */
 	struct smc_clc_msg_hdr hdr;
 	struct smc_clc_msg_local lcl;
@@ -102,6 +116,14 @@ struct smc_clc_msg_decline {	/* clc decline message */
 	struct smc_clc_msg_trail trl; /* eye catcher "SMCR" EBCDIC */
 } __aligned(4);
 
+/* determine start of the prefix area within the proposal message */
+static inline struct smc_clc_msg_proposal_prefix *
+smc_clc_proposal_get_prefix(struct smc_clc_msg_proposal *pclc)
+{
+	return (struct smc_clc_msg_proposal_prefix *)
+	       ((u8 *)pclc + sizeof(*pclc) + ntohs(pclc->iparea_offset));
+}
+
 struct smc_sock;
 struct smc_ib_device;
 
diff --git a/net/smc/smc_close.c b/net/smc/smc_close.c
index 48615d2ac4aa..e194c6cc308a 100644
--- a/net/smc/smc_close.c
+++ b/net/smc/smc_close.c
@@ -113,7 +113,7 @@ static int smc_close_abort(struct smc_connection *conn)
 /* terminate smc socket abnormally - active abort
  * RDMA communication no longer possible
  */
-void smc_close_active_abort(struct smc_sock *smc)
+static void smc_close_active_abort(struct smc_sock *smc)
 {
 	struct smc_cdc_conn_state_flags *txflags =
 		&smc->conn.local_tx_ctrl.conn_state_flags;
diff --git a/net/smc/smc_close.h b/net/smc/smc_close.h
index ed82506b1b0a..8c498885d758 100644
--- a/net/smc/smc_close.h
+++ b/net/smc/smc_close.h
@@ -20,7 +20,6 @@
 #define SMC_CLOSE_SOCK_PUT_DELAY		HZ
 
 void smc_close_wake_tx_prepared(struct smc_sock *smc);
-void smc_close_active_abort(struct smc_sock *smc);
 int smc_close_active(struct smc_sock *smc);
 void smc_close_sock_put_work(struct work_struct *work);
 int smc_close_shutdown_write(struct smc_sock *smc);
diff --git a/net/smc/smc_rx.c b/net/smc/smc_rx.c
index cbf58637ee14..9dc392ca06bf 100644
--- a/net/smc/smc_rx.c
+++ b/net/smc/smc_rx.c
@@ -65,7 +65,6 @@ static int smc_rx_wait_data(struct smc_sock *smc, long *timeo)
 	rc = sk_wait_event(sk, timeo,
 			   sk->sk_err ||
 			   sk->sk_shutdown & RCV_SHUTDOWN ||
-			   sock_flag(sk, SOCK_DONE) ||
 			   atomic_read(&conn->bytes_to_rcv) ||
 			   smc_cdc_rxed_any_close_or_senddone(conn),
 			   &wait);
@@ -116,7 +115,7 @@ int smc_rx_recvmsg(struct smc_sock *smc, struct msghdr *msg, size_t len,
 		if (read_done) {
 			if (sk->sk_err ||
 			    sk->sk_state == SMC_CLOSED ||
-			    (sk->sk_shutdown & RCV_SHUTDOWN) ||
+			    sk->sk_shutdown & RCV_SHUTDOWN ||
 			    !timeo ||
 			    signal_pending(current) ||
 			    smc_cdc_rxed_any_close_or_senddone(conn) ||
@@ -124,8 +123,6 @@ int smc_rx_recvmsg(struct smc_sock *smc, struct msghdr *msg, size_t len,
 			    peer_conn_abort)
 				break;
 		} else {
-			if (sock_flag(sk, SOCK_DONE))
-				break;
 			if (sk->sk_err) {
 				read_done = sock_error(sk);
 				break;
diff --git a/net/smc/smc_tx.c b/net/smc/smc_tx.c
index c48dc2d5fd3a..2e50fddf8ce9 100644
--- a/net/smc/smc_tx.c
+++ b/net/smc/smc_tx.c
@@ -104,14 +104,12 @@ static int smc_tx_wait_memory(struct smc_sock *smc, int flags)
 		if (atomic_read(&conn->sndbuf_space))
 			break; /* at least 1 byte of free space available */
 		set_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
-		sk->sk_write_pending++;
 		sk_wait_event(sk, &timeo,
 			      sk->sk_err ||
 			      (sk->sk_shutdown & SEND_SHUTDOWN) ||
 			      smc_cdc_rxed_any_close_or_senddone(conn) ||
 			      atomic_read(&conn->sndbuf_space),
 			      &wait);
-		sk->sk_write_pending--;
 	}
 	remove_wait_queue(sk_sleep(sk), &wait);
 	return rc;
@@ -450,9 +448,7 @@ static void smc_tx_work(struct work_struct *work)
 void smc_tx_consumer_update(struct smc_connection *conn)
 {
 	union smc_host_cursor cfed, cons;
-	struct smc_cdc_tx_pend *pend;
-	struct smc_wr_buf *wr_buf;
-	int to_confirm, rc;
+	int to_confirm;
 
 	smc_curs_write(&cons,
 		       smc_curs_read(&conn->local_tx_ctrl.cons, conn),
@@ -466,10 +462,7 @@ void smc_tx_consumer_update(struct smc_connection *conn)
 	    ((to_confirm > conn->rmbe_update_limit) &&
 	     ((to_confirm > (conn->rmbe_size / 2)) ||
 	      conn->local_rx_ctrl.prod_flags.write_blocked))) {
-		rc = smc_cdc_get_free_slot(conn, &wr_buf, &pend);
-		if (!rc)
-			rc = smc_cdc_msg_send(conn, wr_buf, pend);
-		if (rc < 0) {
+		if (smc_cdc_get_slot_and_msg_send(conn) < 0) {
 			schedule_delayed_work(&conn->tx_work,
 					      SMC_TX_WORK_DELAY);
 			return;
diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
index 329325bd553e..37892b3909af 100644
--- a/net/tipc/bcast.c
+++ b/net/tipc/bcast.c
@@ -1,7 +1,7 @@
 /*
  * net/tipc/bcast.c: TIPC broadcast code
  *
- * Copyright (c) 2004-2006, 2014-2016, Ericsson AB
+ * Copyright (c) 2004-2006, 2014-2017, Ericsson AB
  * Copyright (c) 2004, Intel Corporation.
  * Copyright (c) 2005, 2010-2011, Wind River Systems
  * All rights reserved.
@@ -42,8 +42,8 @@
 #include "link.h"
 #include "name_table.h"
 
-#define	BCLINK_WIN_DEFAULT	50	/* bcast link window size (default) */
-#define	BCLINK_WIN_MIN	        32	/* bcast minimum link window size */
+#define BCLINK_WIN_DEFAULT  50	/* bcast link window size (default) */
+#define BCLINK_WIN_MIN      32	/* bcast minimum link window size */
 
 const char tipc_bclink_name[] = "broadcast-link";
 
@@ -74,6 +74,10 @@ static struct tipc_bc_base *tipc_bc_base(struct net *net)
 	return tipc_net(net)->bcbase;
 }
 
+/* tipc_bcast_get_mtu(): -get the MTU currently used by broadcast link
+ * Note: the MTU is decremented to give room for a tunnel header, in
+ * case the message needs to be sent as replicast
+ */
 int tipc_bcast_get_mtu(struct net *net)
 {
 	return tipc_link_mtu(tipc_bc_sndlink(net)) - INT_H_SIZE;
@@ -515,7 +519,7 @@ int tipc_bcast_init(struct net *net)
 	spin_lock_init(&tipc_net(net)->bclock);
 
 	if (!tipc_link_bc_create(net, 0, 0,
-				 U16_MAX,
+				 FB_MTU,
 				 BCLINK_WIN_DEFAULT,
 				 0,
 				 &bb->inputq,
diff --git a/net/tipc/link.c b/net/tipc/link.c
index 6bce0b1117bd..2d6b2aed30e0 100644
--- a/net/tipc/link.c
+++ b/net/tipc/link.c
@@ -483,7 +483,7 @@ bool tipc_link_create(struct net *net, char *if_name, int bearer_id,
 /**
  * tipc_link_bc_create - create new link to be used for broadcast
  * @n: pointer to associated node
- * @mtu: mtu to be used
+ * @mtu: mtu to be used initially if no peers
  * @window: send window to be used
  * @inputq: queue to put messages ready for delivery
  * @namedq: queue to put binding table update messages ready for delivery
diff --git a/net/tipc/msg.c b/net/tipc/msg.c
index b0d07b35909d..55d8ba92291d 100644
--- a/net/tipc/msg.c
+++ b/net/tipc/msg.c
@@ -251,20 +251,23 @@ bool tipc_msg_validate(struct sk_buff **_skb)
  * @pktmax: Max packet size that can be used
  * @list: Buffer or chain of buffers to be returned to caller
  *
+ * Note that the recursive call we are making here is safe, since it can
+ * logically go only one further level down.
+ *
  * Returns message data size or errno: -ENOMEM, -EFAULT
  */
-int tipc_msg_build(struct tipc_msg *mhdr, struct msghdr *m,
-		   int offset, int dsz, int pktmax, struct sk_buff_head *list)
+int tipc_msg_build(struct tipc_msg *mhdr, struct msghdr *m, int offset,
+		   int dsz, int pktmax, struct sk_buff_head *list)
 {
 	int mhsz = msg_hdr_sz(mhdr);
+	struct tipc_msg pkthdr;
 	int msz = mhsz + dsz;
-	int pktno = 1;
-	int pktsz;
 	int pktrem = pktmax;
-	int drem = dsz;
-	struct tipc_msg pkthdr;
 	struct sk_buff *skb;
+	int drem = dsz;
+	int pktno = 1;
 	char *pktpos;
+	int pktsz;
 	int rc;
 
 	msg_set_size(mhdr, msz);
@@ -272,8 +275,18 @@ int tipc_msg_build(struct tipc_msg *mhdr, struct msghdr *m,
 	/* No fragmentation needed? */
 	if (likely(msz <= pktmax)) {
 		skb = tipc_buf_acquire(msz, GFP_KERNEL);
-		if (unlikely(!skb))
+
+		/* Fall back to smaller MTU if node local message */
+		if (unlikely(!skb)) {
+			if (pktmax != MAX_MSG_SIZE)
+				return -ENOMEM;
+			rc = tipc_msg_build(mhdr, m, offset, dsz, FB_MTU, list);
+			if (rc != dsz)
+				return rc;
+			if (tipc_msg_assemble(list))
+				return dsz;
 			return -ENOMEM;
+		}
 		skb_orphan(skb);
 		__skb_queue_tail(list, skb);
 		skb_copy_to_linear_data(skb, mhdr, mhsz);
@@ -589,6 +602,30 @@ bool tipc_msg_lookup_dest(struct net *net, struct sk_buff *skb, int *err)
 	return true;
 }
 
+/* tipc_msg_assemble() - assemble chain of fragments into one message
+ */
+bool tipc_msg_assemble(struct sk_buff_head *list)
+{
+	struct sk_buff *skb, *tmp = NULL;
+
+	if (skb_queue_len(list) == 1)
+		return true;
+
+	while ((skb = __skb_dequeue(list))) {
+		skb->next = NULL;
+		if (tipc_buf_append(&tmp, &skb)) {
+			__skb_queue_tail(list, skb);
+			return true;
+		}
+		if (!tmp)
+			break;
+	}
+	__skb_queue_purge(list);
+	__skb_queue_head_init(list);
+	pr_warn("Failed do assemble buffer\n");
+	return false;
+}
+
 /* tipc_msg_reassemble() - clone a buffer chain of fragments and
  *                         reassemble the clones into one message
  */
diff --git a/net/tipc/msg.h b/net/tipc/msg.h
index 3e4384c222f7..b4ba1b4f9ae7 100644
--- a/net/tipc/msg.h
+++ b/net/tipc/msg.h
@@ -98,7 +98,7 @@ struct plist;
 #define MAX_H_SIZE                60	/* Largest possible TIPC header size */
 
 #define MAX_MSG_SIZE (MAX_H_SIZE + TIPC_MAX_USER_MSG_SIZE)
-
+#define FB_MTU                  3744
 #define TIPC_MEDIA_INFO_OFFSET	5
 
 struct tipc_skb_cb {
@@ -943,6 +943,7 @@ bool tipc_msg_extract(struct sk_buff *skb, struct sk_buff **iskb, int *pos);
 int tipc_msg_build(struct tipc_msg *mhdr, struct msghdr *m,
 		   int offset, int dsz, int mtu, struct sk_buff_head *list);
 bool tipc_msg_lookup_dest(struct net *net, struct sk_buff *skb, int *err);
+bool tipc_msg_assemble(struct sk_buff_head *list);
 bool tipc_msg_reassemble(struct sk_buff_head *list, struct sk_buff_head *rcvq);
 bool tipc_msg_pskb_copy(u32 dst, struct sk_buff_head *msg,
 			struct sk_buff_head *cpy);
diff --git a/net/xfrm/xfrm_device.c b/net/xfrm/xfrm_device.c
index 30e5746085b8..c61a7d46b412 100644
--- a/net/xfrm/xfrm_device.c
+++ b/net/xfrm/xfrm_device.c
@@ -120,8 +120,8 @@ bool xfrm_dev_offload_ok(struct sk_buff *skb, struct xfrm_state *x)
 	if (!x->type_offload || x->encap)
 		return false;
 
-	if ((x->xso.offload_handle && (dev == dst->path->dev)) &&
-	     !dst->child->xfrm && x->type->get_mtu) {
+	if ((x->xso.offload_handle && (dev == xfrm_dst_path(dst)->dev)) &&
+	     !xdst->child->xfrm && x->type->get_mtu) {
 		mtu = x->type->get_mtu(x, xdst->child_mtu_cached);
 
 		if (skb->len <= mtu)
diff --git a/net/xfrm/xfrm_output.c b/net/xfrm/xfrm_output.c
index 73ad8c8ef344..23468672a767 100644
--- a/net/xfrm/xfrm_output.c
+++ b/net/xfrm/xfrm_output.c
@@ -44,7 +44,7 @@ static int xfrm_skb_check_space(struct sk_buff *skb)
 
 static struct dst_entry *skb_dst_pop(struct sk_buff *skb)
 {
-	struct dst_entry *child = dst_clone(skb_dst(skb)->child);
+	struct dst_entry *child = dst_clone(xfrm_dst_child(skb_dst(skb)));
 
 	skb_dst_drop(skb);
 	return child;
diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index 9542975eb2f9..22e3350013b4 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -54,7 +54,7 @@ static struct xfrm_policy_afinfo const __rcu *xfrm_policy_afinfo[AF_INET6 + 1]
 static struct kmem_cache *xfrm_dst_cache __read_mostly;
 static __read_mostly seqcount_t xfrm_policy_hash_generation;
 
-static void xfrm_init_pmtu(struct dst_entry *dst);
+static void xfrm_init_pmtu(struct xfrm_dst **bundle, int nr);
 static int stale_bundle(struct dst_entry *dst);
 static int xfrm_bundle_ok(struct xfrm_dst *xdst);
 static void xfrm_policy_queue_process(struct timer_list *t);
@@ -1538,7 +1538,9 @@ static inline int xfrm_fill_dst(struct xfrm_dst *xdst, struct net_device *dev,
  */
 
 static struct dst_entry *xfrm_bundle_create(struct xfrm_policy *policy,
-					    struct xfrm_state **xfrm, int nx,
+					    struct xfrm_state **xfrm,
+					    struct xfrm_dst **bundle,
+					    int nx,
 					    const struct flowi *fl,
 					    struct dst_entry *dst)
 {
@@ -1546,8 +1548,8 @@ static struct dst_entry *xfrm_bundle_create(struct xfrm_policy *policy,
 	unsigned long now = jiffies;
 	struct net_device *dev;
 	struct xfrm_mode *inner_mode;
-	struct dst_entry *dst_prev = NULL;
-	struct dst_entry *dst0 = NULL;
+	struct xfrm_dst *xdst_prev = NULL;
+	struct xfrm_dst *xdst0 = NULL;
 	int i = 0;
 	int err;
 	int header_len = 0;
@@ -1573,13 +1575,14 @@ static struct dst_entry *xfrm_bundle_create(struct xfrm_policy *policy,
 			goto put_states;
 		}
 
-		if (!dst_prev)
-			dst0 = dst1;
+		bundle[i] = xdst;
+		if (!xdst_prev)
+			xdst0 = xdst;
 		else
 			/* Ref count is taken during xfrm_alloc_dst()
 			 * No need to do dst_clone() on dst1
 			 */
-			dst_prev->child = dst1;
+			xfrm_dst_set_child(xdst_prev, &xdst->u.dst);
 
 		if (xfrm[i]->sel.family == AF_UNSPEC) {
 			inner_mode = xfrm_ip2inner_mode(xfrm[i],
@@ -1616,8 +1619,7 @@ static struct dst_entry *xfrm_bundle_create(struct xfrm_policy *policy,
 		dst1->input = dst_discard;
 		dst1->output = inner_mode->afinfo->output;
 
-		dst1->next = dst_prev;
-		dst_prev = dst1;
+		xdst_prev = xdst;
 
 		header_len += xfrm[i]->props.header_len;
 		if (xfrm[i]->type->flags & XFRM_TYPE_NON_FRAGMENT)
@@ -1625,40 +1627,39 @@ static struct dst_entry *xfrm_bundle_create(struct xfrm_policy *policy,
 		trailer_len += xfrm[i]->props.trailer_len;
 	}
 
-	dst_prev->child = dst;
-	dst0->path = dst;
+	xfrm_dst_set_child(xdst_prev, dst);
+	xdst0->path = dst;
 
 	err = -ENODEV;
 	dev = dst->dev;
 	if (!dev)
 		goto free_dst;
 
-	xfrm_init_path((struct xfrm_dst *)dst0, dst, nfheader_len);
-	xfrm_init_pmtu(dst_prev);
+	xfrm_init_path(xdst0, dst, nfheader_len);
+	xfrm_init_pmtu(bundle, nx);
 
-	for (dst_prev = dst0; dst_prev != dst; dst_prev = dst_prev->child) {
-		struct xfrm_dst *xdst = (struct xfrm_dst *)dst_prev;
-
-		err = xfrm_fill_dst(xdst, dev, fl);
+	for (xdst_prev = xdst0; xdst_prev != (struct xfrm_dst *)dst;
+	     xdst_prev = (struct xfrm_dst *) xfrm_dst_child(&xdst_prev->u.dst)) {
+		err = xfrm_fill_dst(xdst_prev, dev, fl);
 		if (err)
 			goto free_dst;
 
-		dst_prev->header_len = header_len;
-		dst_prev->trailer_len = trailer_len;
-		header_len -= xdst->u.dst.xfrm->props.header_len;
-		trailer_len -= xdst->u.dst.xfrm->props.trailer_len;
+		xdst_prev->u.dst.header_len = header_len;
+		xdst_prev->u.dst.trailer_len = trailer_len;
+		header_len -= xdst_prev->u.dst.xfrm->props.header_len;
+		trailer_len -= xdst_prev->u.dst.xfrm->props.trailer_len;
 	}
 
 out:
-	return dst0;
+	return &xdst0->u.dst;
 
 put_states:
 	for (; i < nx; i++)
 		xfrm_state_put(xfrm[i]);
 free_dst:
-	if (dst0)
-		dst_release_immediate(dst0);
-	dst0 = ERR_PTR(err);
+	if (xdst0)
+		dst_release_immediate(&xdst0->u.dst);
+	xdst0 = ERR_PTR(err);
 	goto out;
 }
 
@@ -1800,7 +1801,7 @@ static bool xfrm_xdst_can_reuse(struct xfrm_dst *xdst,
 	for (i = 0; i < num; i++) {
 		if (!dst || dst->xfrm != xfrm[i])
 			return false;
-		dst = dst->child;
+		dst = xfrm_dst_child(dst);
 	}
 
 	return xfrm_bundle_ok(xdst);
@@ -1813,6 +1814,7 @@ xfrm_resolve_and_create_bundle(struct xfrm_policy **pols, int num_pols,
 {
 	struct net *net = xp_net(pols[0]);
 	struct xfrm_state *xfrm[XFRM_MAX_DEPTH];
+	struct xfrm_dst *bundle[XFRM_MAX_DEPTH];
 	struct xfrm_dst *xdst, *old;
 	struct dst_entry *dst;
 	int err;
@@ -1840,7 +1842,7 @@ xfrm_resolve_and_create_bundle(struct xfrm_policy **pols, int num_pols,
 
 	old = xdst;
 
-	dst = xfrm_bundle_create(pols[0], xfrm, err, fl, dst_orig);
+	dst = xfrm_bundle_create(pols[0], xfrm, bundle, err, fl, dst_orig);
 	if (IS_ERR(dst)) {
 		XFRM_INC_STATS(net, LINUX_MIB_XFRMOUTBUNDLEGENERROR);
 		return ERR_CAST(dst);
@@ -1880,8 +1882,8 @@ static void xfrm_policy_queue_process(struct timer_list *t)
 	xfrm_decode_session(skb, &fl, dst->ops->family);
 	spin_unlock(&pq->hold_queue.lock);
 
-	dst_hold(dst->path);
-	dst = xfrm_lookup(net, dst->path, &fl, sk, 0);
+	dst_hold(xfrm_dst_path(dst));
+	dst = xfrm_lookup(net, xfrm_dst_path(dst), &fl, sk, 0);
 	if (IS_ERR(dst))
 		goto purge_queue;
 
@@ -1910,8 +1912,8 @@ static void xfrm_policy_queue_process(struct timer_list *t)
 		skb = __skb_dequeue(&list);
 
 		xfrm_decode_session(skb, &fl, skb_dst(skb)->ops->family);
-		dst_hold(skb_dst(skb)->path);
-		dst = xfrm_lookup(net, skb_dst(skb)->path, &fl, skb->sk, 0);
+		dst_hold(xfrm_dst_path(skb_dst(skb)));
+		dst = xfrm_lookup(net, xfrm_dst_path(skb_dst(skb)), &fl, skb->sk, 0);
 		if (IS_ERR(dst)) {
 			kfree_skb(skb);
 			continue;
@@ -2012,8 +2014,8 @@ static struct xfrm_dst *xfrm_create_dummy_bundle(struct net *net,
 	dst1->output = xdst_queue_output;
 
 	dst_hold(dst);
-	dst1->child = dst;
-	dst1->path = dst;
+	xfrm_dst_set_child(xdst, dst);
+	xdst->path = dst;
 
 	xfrm_init_path((struct xfrm_dst *)dst1, dst, 0);
 
@@ -2576,7 +2578,7 @@ static int stale_bundle(struct dst_entry *dst)
 
 void xfrm_dst_ifdown(struct dst_entry *dst, struct net_device *dev)
 {
-	while ((dst = dst->child) && dst->xfrm && dst->dev == dev) {
+	while ((dst = xfrm_dst_child(dst)) && dst->xfrm && dst->dev == dev) {
 		dst->dev = dev_net(dev)->loopback_dev;
 		dev_hold(dst->dev);
 		dev_put(dev);
@@ -2600,13 +2602,15 @@ static struct dst_entry *xfrm_negative_advice(struct dst_entry *dst)
 	return dst;
 }
 
-static void xfrm_init_pmtu(struct dst_entry *dst)
+static void xfrm_init_pmtu(struct xfrm_dst **bundle, int nr)
 {
-	do {
-		struct xfrm_dst *xdst = (struct xfrm_dst *)dst;
+	while (nr--) {
+		struct xfrm_dst *xdst = bundle[nr];
 		u32 pmtu, route_mtu_cached;
+		struct dst_entry *dst;
 
-		pmtu = dst_mtu(dst->child);
+		dst = &xdst->u.dst;
+		pmtu = dst_mtu(xfrm_dst_child(dst));
 		xdst->child_mtu_cached = pmtu;
 
 		pmtu = xfrm_state_mtu(dst->xfrm, pmtu);
@@ -2618,7 +2622,7 @@ static void xfrm_init_pmtu(struct dst_entry *dst)
 			pmtu = route_mtu_cached;
 
 		dst_metric_set(dst, RTAX_MTU, pmtu);
-	} while ((dst = dst->next));
+	}
 }
 
 /* Check that the bundle accepts the flow and its components are
@@ -2627,19 +2631,20 @@ static void xfrm_init_pmtu(struct dst_entry *dst)
 
 static int xfrm_bundle_ok(struct xfrm_dst *first)
 {
+	struct xfrm_dst *bundle[XFRM_MAX_DEPTH];
 	struct dst_entry *dst = &first->u.dst;
-	struct xfrm_dst *last;
+	struct xfrm_dst *xdst;
+	int start_from, nr;
 	u32 mtu;
 
-	if (!dst_check(dst->path, ((struct xfrm_dst *)dst)->path_cookie) ||
+	if (!dst_check(xfrm_dst_path(dst), ((struct xfrm_dst *)dst)->path_cookie) ||
 	    (dst->dev && !netif_running(dst->dev)))
 		return 0;
 
 	if (dst->flags & DST_XFRM_QUEUE)
 		return 1;
 
-	last = NULL;
-
+	start_from = nr = 0;
 	do {
 		struct xfrm_dst *xdst = (struct xfrm_dst *)dst;
 
@@ -2651,9 +2656,11 @@ static int xfrm_bundle_ok(struct xfrm_dst *first)
 		    xdst->policy_genid != atomic_read(&xdst->pols[0]->genid))
 			return 0;
 
-		mtu = dst_mtu(dst->child);
+		bundle[nr++] = xdst;
+
+		mtu = dst_mtu(xfrm_dst_child(dst));
 		if (xdst->child_mtu_cached != mtu) {
-			last = xdst;
+			start_from = nr;
 			xdst->child_mtu_cached = mtu;
 		}
 
@@ -2661,30 +2668,30 @@ static int xfrm_bundle_ok(struct xfrm_dst *first)
 			return 0;
 		mtu = dst_mtu(xdst->route);
 		if (xdst->route_mtu_cached != mtu) {
-			last = xdst;
+			start_from = nr;
 			xdst->route_mtu_cached = mtu;
 		}
 
-		dst = dst->child;
+		dst = xfrm_dst_child(dst);
 	} while (dst->xfrm);
 
-	if (likely(!last))
+	if (likely(!start_from))
 		return 1;
 
-	mtu = last->child_mtu_cached;
-	for (;;) {
-		dst = &last->u.dst;
+	xdst = bundle[start_from - 1];
+	mtu = xdst->child_mtu_cached;
+	while (start_from--) {
+		dst = &xdst->u.dst;
 
 		mtu = xfrm_state_mtu(dst->xfrm, mtu);
-		if (mtu > last->route_mtu_cached)
-			mtu = last->route_mtu_cached;
+		if (mtu > xdst->route_mtu_cached)
+			mtu = xdst->route_mtu_cached;
 		dst_metric_set(dst, RTAX_MTU, mtu);
-
-		if (last == first)
+		if (!start_from)
 			break;
 
-		last = (struct xfrm_dst *)last->u.dst.next;
-		last->child_mtu_cached = mtu;
+		xdst = bundle[start_from - 1];
+		xdst->child_mtu_cached = mtu;
 	}
 
 	return 1;
@@ -2692,22 +2699,20 @@ static int xfrm_bundle_ok(struct xfrm_dst *first)
 
 static unsigned int xfrm_default_advmss(const struct dst_entry *dst)
 {
-	return dst_metric_advmss(dst->path);
+	return dst_metric_advmss(xfrm_dst_path(dst));
 }
 
 static unsigned int xfrm_mtu(const struct dst_entry *dst)
 {
 	unsigned int mtu = dst_metric_raw(dst, RTAX_MTU);
 
-	return mtu ? : dst_mtu(dst->path);
+	return mtu ? : dst_mtu(xfrm_dst_path(dst));
 }
 
 static const void *xfrm_get_dst_nexthop(const struct dst_entry *dst,
 					const void *daddr)
 {
-	const struct dst_entry *path = dst->path;
-
-	for (; dst != path; dst = dst->child) {
+	while (dst->xfrm) {
 		const struct xfrm_state *xfrm = dst->xfrm;
 
 		if (xfrm->props.mode == XFRM_MODE_TRANSPORT)
@@ -2716,6 +2721,8 @@ static const void *xfrm_get_dst_nexthop(const struct dst_entry *dst,
 			daddr = xfrm->coaddr;
 		else if (!(xfrm->type->flags & XFRM_TYPE_LOCAL_COADDR))
 			daddr = &xfrm->id.daddr;
+
+		dst = xfrm_dst_child(dst);
 	}
 	return daddr;
 }
@@ -2724,7 +2731,7 @@ static struct neighbour *xfrm_neigh_lookup(const struct dst_entry *dst,
 					   struct sk_buff *skb,
 					   const void *daddr)
 {
-	const struct dst_entry *path = dst->path;
+	const struct dst_entry *path = xfrm_dst_path(dst);
 
 	if (!skb)
 		daddr = xfrm_get_dst_nexthop(dst, daddr);
@@ -2733,7 +2740,7 @@ static struct neighbour *xfrm_neigh_lookup(const struct dst_entry *dst,
 
 static void xfrm_confirm_neigh(const struct dst_entry *dst, const void *daddr)
 {
-	const struct dst_entry *path = dst->path;
+	const struct dst_entry *path = xfrm_dst_path(dst);
 
 	daddr = xfrm_get_dst_nexthop(dst, daddr);
 	path->ops->confirm_neigh(path, daddr);
diff --git a/samples/bpf/tcbpf2_kern.c b/samples/bpf/tcbpf2_kern.c
index 370b749f5ee6..79ad061079dd 100644
--- a/samples/bpf/tcbpf2_kern.c
+++ b/samples/bpf/tcbpf2_kern.c
@@ -81,6 +81,49 @@ int _gre_get_tunnel(struct __sk_buff *skb)
 	return TC_ACT_OK;
 }
 
+SEC("ip6gretap_set_tunnel")
+int _ip6gretap_set_tunnel(struct __sk_buff *skb)
+{
+	struct bpf_tunnel_key key;
+	int ret;
+
+	__builtin_memset(&key, 0x0, sizeof(key));
+	key.remote_ipv6[3] = _htonl(0x11); /* ::11 */
+	key.tunnel_id = 2;
+	key.tunnel_tos = 0;
+	key.tunnel_ttl = 64;
+	key.tunnel_label = 0xabcde;
+
+	ret = bpf_skb_set_tunnel_key(skb, &key, sizeof(key),
+				     BPF_F_TUNINFO_IPV6 | BPF_F_ZERO_CSUM_TX);
+	if (ret < 0) {
+		ERROR(ret);
+		return TC_ACT_SHOT;
+	}
+
+	return TC_ACT_OK;
+}
+
+SEC("ip6gretap_get_tunnel")
+int _ip6gretap_get_tunnel(struct __sk_buff *skb)
+{
+	char fmt[] = "key %d remote ip6 ::%x label %x\n";
+	struct bpf_tunnel_key key;
+	int ret;
+
+	ret = bpf_skb_get_tunnel_key(skb, &key, sizeof(key),
+				     BPF_F_TUNINFO_IPV6);
+	if (ret < 0) {
+		ERROR(ret);
+		return TC_ACT_SHOT;
+	}
+
+	bpf_trace_printk(fmt, sizeof(fmt),
+			 key.tunnel_id, key.remote_ipv6[3], key.tunnel_label);
+
+	return TC_ACT_OK;
+}
+
 SEC("erspan_set_tunnel")
 int _erspan_set_tunnel(struct __sk_buff *skb)
 {
@@ -138,6 +181,64 @@ int _erspan_get_tunnel(struct __sk_buff *skb)
 	return TC_ACT_OK;
 }
 
+SEC("ip4ip6erspan_set_tunnel")
+int _ip4ip6erspan_set_tunnel(struct __sk_buff *skb)
+{
+	struct bpf_tunnel_key key;
+	struct erspan_metadata md;
+	int ret;
+
+	__builtin_memset(&key, 0x0, sizeof(key));
+	key.remote_ipv6[3] = _htonl(0x11);
+	key.tunnel_id = 2;
+	key.tunnel_tos = 0;
+	key.tunnel_ttl = 64;
+
+	ret = bpf_skb_set_tunnel_key(skb, &key, sizeof(key),
+				     BPF_F_TUNINFO_IPV6);
+	if (ret < 0) {
+		ERROR(ret);
+		return TC_ACT_SHOT;
+	}
+
+	md.index = htonl(123);
+	ret = bpf_skb_set_tunnel_opt(skb, &md, sizeof(md));
+	if (ret < 0) {
+		ERROR(ret);
+		return TC_ACT_SHOT;
+	}
+
+	return TC_ACT_OK;
+}
+
+SEC("ip4ip6erspan_get_tunnel")
+int _ip4ip6erspan_get_tunnel(struct __sk_buff *skb)
+{
+	char fmt[] = "key %d remote ip6 ::%x erspan index 0x%x\n";
+	struct bpf_tunnel_key key;
+	struct erspan_metadata md;
+	u32 index;
+	int ret;
+
+	ret = bpf_skb_get_tunnel_key(skb, &key, sizeof(key), BPF_F_TUNINFO_IPV6);
+	if (ret < 0) {
+		ERROR(ret);
+		return TC_ACT_SHOT;
+	}
+
+	ret = bpf_skb_get_tunnel_opt(skb, &md, sizeof(md));
+	if (ret < 0) {
+		ERROR(ret);
+		return TC_ACT_SHOT;
+	}
+
+	index = bpf_ntohl(md.index);
+	bpf_trace_printk(fmt, sizeof(fmt),
+			key.tunnel_id, key.remote_ipv6[0], index);
+
+	return TC_ACT_OK;
+}
+
 SEC("vxlan_set_tunnel")
 int _vxlan_set_tunnel(struct __sk_buff *skb)
 {
diff --git a/samples/bpf/test_cgrp2_attach2.c b/samples/bpf/test_cgrp2_attach2.c
index 3e8232cc04a8..1af412ec6007 100644
--- a/samples/bpf/test_cgrp2_attach2.c
+++ b/samples/bpf/test_cgrp2_attach2.c
@@ -78,7 +78,8 @@ static int test_foo_bar(void)
 	if (join_cgroup(FOO))
 		goto err;
 
-	if (bpf_prog_attach(drop_prog, foo, BPF_CGROUP_INET_EGRESS, 1)) {
+	if (bpf_prog_attach(drop_prog, foo, BPF_CGROUP_INET_EGRESS,
+			    BPF_F_ALLOW_OVERRIDE)) {
 		log_err("Attaching prog to /foo");
 		goto err;
 	}
@@ -97,7 +98,8 @@ static int test_foo_bar(void)
 	printf("Attached DROP prog. This ping in cgroup /foo/bar should fail...\n");
 	assert(system(PING_CMD) != 0);
 
-	if (bpf_prog_attach(allow_prog, bar, BPF_CGROUP_INET_EGRESS, 1)) {
+	if (bpf_prog_attach(allow_prog, bar, BPF_CGROUP_INET_EGRESS,
+			    BPF_F_ALLOW_OVERRIDE)) {
 		log_err("Attaching prog to /foo/bar");
 		goto err;
 	}
@@ -114,7 +116,8 @@ static int test_foo_bar(void)
 	       "This ping in cgroup /foo/bar should fail...\n");
 	assert(system(PING_CMD) != 0);
 
-	if (bpf_prog_attach(allow_prog, bar, BPF_CGROUP_INET_EGRESS, 1)) {
+	if (bpf_prog_attach(allow_prog, bar, BPF_CGROUP_INET_EGRESS,
+			    BPF_F_ALLOW_OVERRIDE)) {
 		log_err("Attaching prog to /foo/bar");
 		goto err;
 	}
@@ -128,7 +131,8 @@ static int test_foo_bar(void)
 	       "This ping in cgroup /foo/bar should pass...\n");
 	assert(system(PING_CMD) == 0);
 
-	if (bpf_prog_attach(allow_prog, bar, BPF_CGROUP_INET_EGRESS, 1)) {
+	if (bpf_prog_attach(allow_prog, bar, BPF_CGROUP_INET_EGRESS,
+			    BPF_F_ALLOW_OVERRIDE)) {
 		log_err("Attaching prog to /foo/bar");
 		goto err;
 	}
@@ -161,13 +165,15 @@ static int test_foo_bar(void)
 		goto err;
 	}
 
-	if (!bpf_prog_attach(allow_prog, bar, BPF_CGROUP_INET_EGRESS, 1)) {
+	if (!bpf_prog_attach(allow_prog, bar, BPF_CGROUP_INET_EGRESS,
+			     BPF_F_ALLOW_OVERRIDE)) {
 		errno = 0;
 		log_err("Unexpected success attaching overridable prog to /foo/bar");
 		goto err;
 	}
 
-	if (!bpf_prog_attach(allow_prog, foo, BPF_CGROUP_INET_EGRESS, 1)) {
+	if (!bpf_prog_attach(allow_prog, foo, BPF_CGROUP_INET_EGRESS,
+			     BPF_F_ALLOW_OVERRIDE)) {
 		errno = 0;
 		log_err("Unexpected success attaching overridable prog to /foo");
 		goto err;
@@ -273,27 +279,33 @@ static int test_multiprog(void)
 	if (join_cgroup("/cg1/cg2/cg3/cg4/cg5"))
 		goto err;
 
-	if (bpf_prog_attach(allow_prog[0], cg1, BPF_CGROUP_INET_EGRESS, 2)) {
+	if (bpf_prog_attach(allow_prog[0], cg1, BPF_CGROUP_INET_EGRESS,
+			    BPF_F_ALLOW_MULTI)) {
 		log_err("Attaching prog to cg1");
 		goto err;
 	}
-	if (!bpf_prog_attach(allow_prog[0], cg1, BPF_CGROUP_INET_EGRESS, 2)) {
+	if (!bpf_prog_attach(allow_prog[0], cg1, BPF_CGROUP_INET_EGRESS,
+			     BPF_F_ALLOW_MULTI)) {
 		log_err("Unexpected success attaching the same prog to cg1");
 		goto err;
 	}
-	if (bpf_prog_attach(allow_prog[1], cg1, BPF_CGROUP_INET_EGRESS, 2)) {
+	if (bpf_prog_attach(allow_prog[1], cg1, BPF_CGROUP_INET_EGRESS,
+			    BPF_F_ALLOW_MULTI)) {
 		log_err("Attaching prog2 to cg1");
 		goto err;
 	}
-	if (bpf_prog_attach(allow_prog[2], cg2, BPF_CGROUP_INET_EGRESS, 1)) {
+	if (bpf_prog_attach(allow_prog[2], cg2, BPF_CGROUP_INET_EGRESS,
+			    BPF_F_ALLOW_OVERRIDE)) {
 		log_err("Attaching prog to cg2");
 		goto err;
 	}
-	if (bpf_prog_attach(allow_prog[3], cg3, BPF_CGROUP_INET_EGRESS, 2)) {
+	if (bpf_prog_attach(allow_prog[3], cg3, BPF_CGROUP_INET_EGRESS,
+			    BPF_F_ALLOW_MULTI)) {
 		log_err("Attaching prog to cg3");
 		goto err;
 	}
-	if (bpf_prog_attach(allow_prog[4], cg4, BPF_CGROUP_INET_EGRESS, 1)) {
+	if (bpf_prog_attach(allow_prog[4], cg4, BPF_CGROUP_INET_EGRESS,
+			    BPF_F_ALLOW_OVERRIDE)) {
 		log_err("Attaching prog to cg4");
 		goto err;
 	}
diff --git a/samples/bpf/test_tunnel_bpf.sh b/samples/bpf/test_tunnel_bpf.sh
index 312e1722a39f..f53efb62f699 100755
--- a/samples/bpf/test_tunnel_bpf.sh
+++ b/samples/bpf/test_tunnel_bpf.sh
@@ -33,6 +33,30 @@ function add_gre_tunnel {
 	ip addr add dev $DEV 10.1.1.200/24
 }
 
+function add_ip6gretap_tunnel {
+
+	# assign ipv6 address
+	ip netns exec at_ns0 ip addr add ::11/96 dev veth0
+	ip netns exec at_ns0 ip link set dev veth0 up
+	ip addr add dev veth1 ::22/96
+	ip link set dev veth1 up
+
+	# in namespace
+	ip netns exec at_ns0 \
+		ip link add dev $DEV_NS type $TYPE flowlabel 0xbcdef key 2 \
+		local ::11 remote ::22
+
+	ip netns exec at_ns0 ip addr add dev $DEV_NS 10.1.1.100/24
+	ip netns exec at_ns0 ip addr add dev $DEV_NS fc80::100/96
+	ip netns exec at_ns0 ip link set dev $DEV_NS up
+
+	# out of namespace
+	ip link add dev $DEV type $TYPE external
+	ip addr add dev $DEV 10.1.1.200/24
+	ip addr add dev $DEV fc80::200/24
+	ip link set dev $DEV up
+}
+
 function add_erspan_tunnel {
 	# in namespace
 	ip netns exec at_ns0 \
@@ -46,6 +70,28 @@ function add_erspan_tunnel {
 	ip addr add dev $DEV 10.1.1.200/24
 }
 
+function add_ip6erspan_tunnel {
+
+	# assign ipv6 address
+	ip netns exec at_ns0 ip addr add ::11/96 dev veth0
+	ip netns exec at_ns0 ip link set dev veth0 up
+	ip addr add dev veth1 ::22/96
+	ip link set dev veth1 up
+
+	# in namespace
+	ip netns exec at_ns0 \
+		ip link add dev $DEV_NS type $TYPE seq key 2 erspan 123 \
+		local ::11 remote ::22
+
+	ip netns exec at_ns0 ip addr add dev $DEV_NS 10.1.1.100/24
+	ip netns exec at_ns0 ip link set dev $DEV_NS up
+
+	# out of namespace
+	ip link add dev $DEV type $TYPE external
+	ip addr add dev $DEV 10.1.1.200/24
+	ip link set dev $DEV up
+}
+
 function add_vxlan_tunnel {
 	# Set static ARP entry here because iptables set-mark works
 	# on L3 packet, as a result not applying to ARP packets,
@@ -113,6 +159,41 @@ function test_gre {
 	cleanup
 }
 
+function test_ip6gre {
+	TYPE=ip6gre
+	DEV_NS=ip6gre00
+	DEV=ip6gre11
+	config_device
+	# reuse the ip6gretap function
+	add_ip6gretap_tunnel
+	attach_bpf $DEV ip6gretap_set_tunnel ip6gretap_get_tunnel
+	# underlay
+	ping6 -c 4 ::11
+	# overlay: ipv4 over ipv6
+	ip netns exec at_ns0 ping -c 1 10.1.1.200
+	ping -c 1 10.1.1.100
+	# overlay: ipv6 over ipv6
+	ip netns exec at_ns0 ping6 -c 1 fc80::200
+	cleanup
+}
+
+function test_ip6gretap {
+	TYPE=ip6gretap
+	DEV_NS=ip6gretap00
+	DEV=ip6gretap11
+	config_device
+	add_ip6gretap_tunnel
+	attach_bpf $DEV ip6gretap_set_tunnel ip6gretap_get_tunnel
+	# underlay
+	ping6 -c 4 ::11
+	# overlay: ipv4 over ipv6
+	ip netns exec at_ns0 ping -i .2 -c 1 10.1.1.200
+	ping -c 1 10.1.1.100
+	# overlay: ipv6 over ipv6
+	ip netns exec at_ns0 ping6 -c 1 fc80::200
+	cleanup
+}
+
 function test_erspan {
 	TYPE=erspan
 	DEV_NS=erspan00
@@ -125,6 +206,18 @@ function test_erspan {
 	cleanup
 }
 
+function test_ip6erspan {
+	TYPE=ip6erspan
+	DEV_NS=ip6erspan00
+	DEV=ip6erspan11
+	config_device
+	add_ip6erspan_tunnel
+	attach_bpf $DEV ip4ip6erspan_set_tunnel ip4ip6erspan_get_tunnel
+	ping6 -c 3 ::11
+	ip netns exec at_ns0 ping -c 1 10.1.1.200
+	cleanup
+}
+
 function test_vxlan {
 	TYPE=vxlan
 	DEV_NS=vxlan00
@@ -175,9 +268,12 @@ function cleanup {
 	ip link del veth1
 	ip link del ipip11
 	ip link del gretap11
+	ip link del ip6gre11
+	ip link del ip6gretap11
 	ip link del vxlan11
 	ip link del geneve11
 	ip link del erspan11
+	ip link del ip6erspan11
 	pkill tcpdump
 	pkill cat
 	set -ex
@@ -187,8 +283,14 @@ trap cleanup 0 2 3 6 9
 cleanup
 echo "Testing GRE tunnel..."
 test_gre
+echo "Testing IP6GRE tunnel..."
+test_ip6gre
+echo "Testing IP6GRETAP tunnel..."
+test_ip6gretap
 echo "Testing ERSPAN tunnel..."
 test_erspan
+echo "Testing IP6ERSPAN tunnel..."
+test_ip6erspan
 echo "Testing VXLAN tunnel..."
 test_vxlan
 echo "Testing GENEVE tunnel..."
diff --git a/security/selinux/xfrm.c b/security/selinux/xfrm.c
index 56e354fcdfc6..928188902901 100644
--- a/security/selinux/xfrm.c
+++ b/security/selinux/xfrm.c
@@ -452,7 +452,7 @@ int selinux_xfrm_postroute_last(u32 sk_sid, struct sk_buff *skb,
 	if (dst) {
 		struct dst_entry *iter;
 
-		for (iter = dst; iter != NULL; iter = iter->child) {
+		for (iter = dst; iter != NULL; iter = xfrm_dst_child(iter)) {
 			struct xfrm_state *x = iter->xfrm;
 
 			if (x && selinux_authorizable_xfrm(x))
diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 21a2d76b67dc..f309ab9afd9b 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -29,9 +29,10 @@ TEST_GEN_PROGS = test_verifier test_tag test_maps test_lru_map test_lpm_map test
 
 TEST_GEN_FILES = test_pkt_access.o test_xdp.o test_l4lb.o test_tcp_estats.o test_obj_id.o \
 	test_pkt_md_access.o test_xdp_redirect.o test_xdp_meta.o sockmap_parse_prog.o     \
-	sockmap_verdict_prog.o dev_cgroup.o
+	sockmap_verdict_prog.o dev_cgroup.o sample_ret0.o
 
-TEST_PROGS := test_kmod.sh test_xdp_redirect.sh test_xdp_meta.sh
+TEST_PROGS := test_kmod.sh test_xdp_redirect.sh test_xdp_meta.sh \
+	test_offload.py
 
 include ../lib.mk
 
diff --git a/tools/testing/selftests/bpf/sample_ret0.c b/tools/testing/selftests/bpf/sample_ret0.c
new file mode 100644
index 000000000000..fec99750d6ea
--- /dev/null
+++ b/tools/testing/selftests/bpf/sample_ret0.c
@@ -0,0 +1,7 @@
+/* SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause) */
+
+/* Sample program which should always load for testing control paths. */
+int func()
+{
+	return 0;
+}
diff --git a/tools/testing/selftests/bpf/test_align.c b/tools/testing/selftests/bpf/test_align.c
index 8591c89c0828..fe916d29e166 100644
--- a/tools/testing/selftests/bpf/test_align.c
+++ b/tools/testing/selftests/bpf/test_align.c
@@ -64,11 +64,11 @@ static struct bpf_align_test tests[] = {
 		.matches = {
 			{1, "R1=ctx(id=0,off=0,imm=0)"},
 			{1, "R10=fp0"},
-			{1, "R3=inv2"},
-			{2, "R3=inv4"},
-			{3, "R3=inv8"},
-			{4, "R3=inv16"},
-			{5, "R3=inv32"},
+			{1, "R3_w=inv2"},
+			{2, "R3_w=inv4"},
+			{3, "R3_w=inv8"},
+			{4, "R3_w=inv16"},
+			{5, "R3_w=inv32"},
 		},
 	},
 	{
@@ -92,17 +92,17 @@ static struct bpf_align_test tests[] = {
 		.matches = {
 			{1, "R1=ctx(id=0,off=0,imm=0)"},
 			{1, "R10=fp0"},
-			{1, "R3=inv1"},
-			{2, "R3=inv2"},
-			{3, "R3=inv4"},
-			{4, "R3=inv8"},
-			{5, "R3=inv16"},
-			{6, "R3=inv1"},
-			{7, "R4=inv32"},
-			{8, "R4=inv16"},
-			{9, "R4=inv8"},
-			{10, "R4=inv4"},
-			{11, "R4=inv2"},
+			{1, "R3_w=inv1"},
+			{2, "R3_w=inv2"},
+			{3, "R3_w=inv4"},
+			{4, "R3_w=inv8"},
+			{5, "R3_w=inv16"},
+			{6, "R3_w=inv1"},
+			{7, "R4_w=inv32"},
+			{8, "R4_w=inv16"},
+			{9, "R4_w=inv8"},
+			{10, "R4_w=inv4"},
+			{11, "R4_w=inv2"},
 		},
 	},
 	{
@@ -121,12 +121,12 @@ static struct bpf_align_test tests[] = {
 		.matches = {
 			{1, "R1=ctx(id=0,off=0,imm=0)"},
 			{1, "R10=fp0"},
-			{1, "R3=inv4"},
-			{2, "R3=inv8"},
-			{3, "R3=inv10"},
-			{4, "R4=inv8"},
-			{5, "R4=inv12"},
-			{6, "R4=inv14"},
+			{1, "R3_w=inv4"},
+			{2, "R3_w=inv8"},
+			{3, "R3_w=inv10"},
+			{4, "R4_w=inv8"},
+			{5, "R4_w=inv12"},
+			{6, "R4_w=inv14"},
 		},
 	},
 	{
@@ -143,10 +143,10 @@ static struct bpf_align_test tests[] = {
 		.matches = {
 			{1, "R1=ctx(id=0,off=0,imm=0)"},
 			{1, "R10=fp0"},
-			{1, "R3=inv7"},
-			{2, "R3=inv7"},
-			{3, "R3=inv14"},
-			{4, "R3=inv56"},
+			{1, "R3_w=inv7"},
+			{2, "R3_w=inv7"},
+			{3, "R3_w=inv14"},
+			{4, "R3_w=inv56"},
 		},
 	},
 
@@ -185,18 +185,18 @@ static struct bpf_align_test tests[] = {
 		.prog_type = BPF_PROG_TYPE_SCHED_CLS,
 		.matches = {
 			{7, "R0=pkt(id=0,off=8,r=8,imm=0)"},
-			{7, "R3=inv(id=0,umax_value=255,var_off=(0x0; 0xff))"},
-			{8, "R3=inv(id=0,umax_value=510,var_off=(0x0; 0x1fe))"},
-			{9, "R3=inv(id=0,umax_value=1020,var_off=(0x0; 0x3fc))"},
-			{10, "R3=inv(id=0,umax_value=2040,var_off=(0x0; 0x7f8))"},
-			{11, "R3=inv(id=0,umax_value=4080,var_off=(0x0; 0xff0))"},
+			{7, "R3_w=inv(id=0,umax_value=255,var_off=(0x0; 0xff))"},
+			{8, "R3_w=inv(id=0,umax_value=510,var_off=(0x0; 0x1fe))"},
+			{9, "R3_w=inv(id=0,umax_value=1020,var_off=(0x0; 0x3fc))"},
+			{10, "R3_w=inv(id=0,umax_value=2040,var_off=(0x0; 0x7f8))"},
+			{11, "R3_w=inv(id=0,umax_value=4080,var_off=(0x0; 0xff0))"},
 			{18, "R3=pkt_end(id=0,off=0,imm=0)"},
-			{18, "R4=inv(id=0,umax_value=255,var_off=(0x0; 0xff))"},
-			{19, "R4=inv(id=0,umax_value=8160,var_off=(0x0; 0x1fe0))"},
-			{20, "R4=inv(id=0,umax_value=4080,var_off=(0x0; 0xff0))"},
-			{21, "R4=inv(id=0,umax_value=2040,var_off=(0x0; 0x7f8))"},
-			{22, "R4=inv(id=0,umax_value=1020,var_off=(0x0; 0x3fc))"},
-			{23, "R4=inv(id=0,umax_value=510,var_off=(0x0; 0x1fe))"},
+			{18, "R4_w=inv(id=0,umax_value=255,var_off=(0x0; 0xff))"},
+			{19, "R4_w=inv(id=0,umax_value=8160,var_off=(0x0; 0x1fe0))"},
+			{20, "R4_w=inv(id=0,umax_value=4080,var_off=(0x0; 0xff0))"},
+			{21, "R4_w=inv(id=0,umax_value=2040,var_off=(0x0; 0x7f8))"},
+			{22, "R4_w=inv(id=0,umax_value=1020,var_off=(0x0; 0x3fc))"},
+			{23, "R4_w=inv(id=0,umax_value=510,var_off=(0x0; 0x1fe))"},
 		},
 	},
 	{
@@ -217,16 +217,16 @@ static struct bpf_align_test tests[] = {
 		},
 		.prog_type = BPF_PROG_TYPE_SCHED_CLS,
 		.matches = {
-			{7, "R3=inv(id=0,umax_value=255,var_off=(0x0; 0xff))"},
-			{8, "R4=inv(id=0,umax_value=255,var_off=(0x0; 0xff))"},
-			{9, "R4=inv(id=0,umax_value=255,var_off=(0x0; 0xff))"},
-			{10, "R4=inv(id=0,umax_value=255,var_off=(0x0; 0xff))"},
-			{11, "R4=inv(id=0,umax_value=510,var_off=(0x0; 0x1fe))"},
-			{12, "R4=inv(id=0,umax_value=255,var_off=(0x0; 0xff))"},
-			{13, "R4=inv(id=0,umax_value=1020,var_off=(0x0; 0x3fc))"},
-			{14, "R4=inv(id=0,umax_value=255,var_off=(0x0; 0xff))"},
-			{15, "R4=inv(id=0,umax_value=2040,var_off=(0x0; 0x7f8))"},
-			{16, "R4=inv(id=0,umax_value=4080,var_off=(0x0; 0xff0))"},
+			{7, "R3_w=inv(id=0,umax_value=255,var_off=(0x0; 0xff))"},
+			{8, "R4_w=inv(id=0,umax_value=255,var_off=(0x0; 0xff))"},
+			{9, "R4_w=inv(id=0,umax_value=255,var_off=(0x0; 0xff))"},
+			{10, "R4_w=inv(id=0,umax_value=255,var_off=(0x0; 0xff))"},
+			{11, "R4_w=inv(id=0,umax_value=510,var_off=(0x0; 0x1fe))"},
+			{12, "R4_w=inv(id=0,umax_value=255,var_off=(0x0; 0xff))"},
+			{13, "R4_w=inv(id=0,umax_value=1020,var_off=(0x0; 0x3fc))"},
+			{14, "R4_w=inv(id=0,umax_value=255,var_off=(0x0; 0xff))"},
+			{15, "R4_w=inv(id=0,umax_value=2040,var_off=(0x0; 0x7f8))"},
+			{16, "R4_w=inv(id=0,umax_value=4080,var_off=(0x0; 0xff0))"},
 		},
 	},
 	{
@@ -257,14 +257,14 @@ static struct bpf_align_test tests[] = {
 		},
 		.prog_type = BPF_PROG_TYPE_SCHED_CLS,
 		.matches = {
-			{4, "R5=pkt(id=0,off=0,r=0,imm=0)"},
-			{5, "R5=pkt(id=0,off=14,r=0,imm=0)"},
-			{6, "R4=pkt(id=0,off=14,r=0,imm=0)"},
+			{4, "R5_w=pkt(id=0,off=0,r=0,imm=0)"},
+			{5, "R5_w=pkt(id=0,off=14,r=0,imm=0)"},
+			{6, "R4_w=pkt(id=0,off=14,r=0,imm=0)"},
 			{10, "R2=pkt(id=0,off=0,r=18,imm=0)"},
 			{10, "R5=pkt(id=0,off=14,r=18,imm=0)"},
-			{10, "R4=inv(id=0,umax_value=255,var_off=(0x0; 0xff))"},
-			{14, "R4=inv(id=0,umax_value=65535,var_off=(0x0; 0xffff))"},
-			{15, "R4=inv(id=0,umax_value=65535,var_off=(0x0; 0xffff))"},
+			{10, "R4_w=inv(id=0,umax_value=255,var_off=(0x0; 0xff))"},
+			{14, "R4_w=inv(id=0,umax_value=65535,var_off=(0x0; 0xffff))"},
+			{15, "R4_w=inv(id=0,umax_value=65535,var_off=(0x0; 0xffff))"},
 		},
 	},
 	{
@@ -320,11 +320,11 @@ static struct bpf_align_test tests[] = {
 			 * alignment of 4.
 			 */
 			{8, "R2=pkt(id=0,off=0,r=8,imm=0)"},
-			{8, "R6=inv(id=0,umax_value=1020,var_off=(0x0; 0x3fc))"},
+			{8, "R6_w=inv(id=0,umax_value=1020,var_off=(0x0; 0x3fc))"},
 			/* Offset is added to packet pointer R5, resulting in
 			 * known fixed offset, and variable offset from R6.
 			 */
-			{11, "R5=pkt(id=1,off=14,r=0,umax_value=1020,var_off=(0x0; 0x3fc))"},
+			{11, "R5_w=pkt(id=1,off=14,r=0,umax_value=1020,var_off=(0x0; 0x3fc))"},
 			/* At the time the word size load is performed from R5,
 			 * it's total offset is NET_IP_ALIGN + reg->off (0) +
 			 * reg->aux_off (14) which is 16.  Then the variable
@@ -336,11 +336,11 @@ static struct bpf_align_test tests[] = {
 			/* Variable offset is added to R5 packet pointer,
 			 * resulting in auxiliary alignment of 4.
 			 */
-			{18, "R5=pkt(id=2,off=0,r=0,umax_value=1020,var_off=(0x0; 0x3fc))"},
+			{18, "R5_w=pkt(id=2,off=0,r=0,umax_value=1020,var_off=(0x0; 0x3fc))"},
 			/* Constant offset is added to R5, resulting in
 			 * reg->off of 14.
 			 */
-			{19, "R5=pkt(id=2,off=14,r=0,umax_value=1020,var_off=(0x0; 0x3fc))"},
+			{19, "R5_w=pkt(id=2,off=14,r=0,umax_value=1020,var_off=(0x0; 0x3fc))"},
 			/* At the time the word size load is performed from R5,
 			 * its total fixed offset is NET_IP_ALIGN + reg->off
 			 * (14) which is 16.  Then the variable offset is 4-byte
@@ -352,18 +352,18 @@ static struct bpf_align_test tests[] = {
 			/* Constant offset is added to R5 packet pointer,
 			 * resulting in reg->off value of 14.
 			 */
-			{26, "R5=pkt(id=0,off=14,r=8"},
+			{26, "R5_w=pkt(id=0,off=14,r=8"},
 			/* Variable offset is added to R5, resulting in a
 			 * variable offset of (4n).
 			 */
-			{27, "R5=pkt(id=3,off=14,r=0,umax_value=1020,var_off=(0x0; 0x3fc))"},
+			{27, "R5_w=pkt(id=3,off=14,r=0,umax_value=1020,var_off=(0x0; 0x3fc))"},
 			/* Constant is added to R5 again, setting reg->off to 18. */
-			{28, "R5=pkt(id=3,off=18,r=0,umax_value=1020,var_off=(0x0; 0x3fc))"},
+			{28, "R5_w=pkt(id=3,off=18,r=0,umax_value=1020,var_off=(0x0; 0x3fc))"},
 			/* And once more we add a variable; resulting var_off
 			 * is still (4n), fixed offset is not changed.
 			 * Also, we create a new reg->id.
 			 */
-			{29, "R5=pkt(id=4,off=18,r=0,umax_value=2040,var_off=(0x0; 0x7fc))"},
+			{29, "R5_w=pkt(id=4,off=18,r=0,umax_value=2040,var_off=(0x0; 0x7fc))"},
 			/* At the time the word size load is performed from R5,
 			 * its total fixed offset is NET_IP_ALIGN + reg->off (18)
 			 * which is 20.  Then the variable offset is (4n), so
@@ -410,11 +410,11 @@ static struct bpf_align_test tests[] = {
 			 * alignment of 4.
 			 */
 			{8, "R2=pkt(id=0,off=0,r=8,imm=0)"},
-			{8, "R6=inv(id=0,umax_value=1020,var_off=(0x0; 0x3fc))"},
+			{8, "R6_w=inv(id=0,umax_value=1020,var_off=(0x0; 0x3fc))"},
 			/* Adding 14 makes R6 be (4n+2) */
-			{9, "R6=inv(id=0,umin_value=14,umax_value=1034,var_off=(0x2; 0x7fc))"},
+			{9, "R6_w=inv(id=0,umin_value=14,umax_value=1034,var_off=(0x2; 0x7fc))"},
 			/* Packet pointer has (4n+2) offset */
-			{11, "R5=pkt(id=1,off=0,r=0,umin_value=14,umax_value=1034,var_off=(0x2; 0x7fc))"},
+			{11, "R5_w=pkt(id=1,off=0,r=0,umin_value=14,umax_value=1034,var_off=(0x2; 0x7fc))"},
 			{13, "R4=pkt(id=1,off=4,r=0,umin_value=14,umax_value=1034,var_off=(0x2; 0x7fc))"},
 			/* At the time the word size load is performed from R5,
 			 * its total fixed offset is NET_IP_ALIGN + reg->off (0)
@@ -426,11 +426,11 @@ static struct bpf_align_test tests[] = {
 			/* Newly read value in R6 was shifted left by 2, so has
 			 * known alignment of 4.
 			 */
-			{18, "R6=inv(id=0,umax_value=1020,var_off=(0x0; 0x3fc))"},
+			{18, "R6_w=inv(id=0,umax_value=1020,var_off=(0x0; 0x3fc))"},
 			/* Added (4n) to packet pointer's (4n+2) var_off, giving
 			 * another (4n+2).
 			 */
-			{19, "R5=pkt(id=2,off=0,r=0,umin_value=14,umax_value=2054,var_off=(0x2; 0xffc))"},
+			{19, "R5_w=pkt(id=2,off=0,r=0,umin_value=14,umax_value=2054,var_off=(0x2; 0xffc))"},
 			{21, "R4=pkt(id=2,off=4,r=0,umin_value=14,umax_value=2054,var_off=(0x2; 0xffc))"},
 			/* At the time the word size load is performed from R5,
 			 * its total fixed offset is NET_IP_ALIGN + reg->off (0)
@@ -473,11 +473,11 @@ static struct bpf_align_test tests[] = {
 		.prog_type = BPF_PROG_TYPE_SCHED_CLS,
 		.result = REJECT,
 		.matches = {
-			{4, "R5=pkt(id=0,off=0,r=0,imm=0)"},
+			{4, "R5_w=pkt(id=0,off=0,r=0,imm=0)"},
 			/* ptr & 0x40 == either 0 or 0x40 */
-			{5, "R5=inv(id=0,umax_value=64,var_off=(0x0; 0x40))"},
+			{5, "R5_w=inv(id=0,umax_value=64,var_off=(0x0; 0x40))"},
 			/* ptr << 2 == unknown, (4n) */
-			{7, "R5=inv(id=0,smax_value=9223372036854775804,umax_value=18446744073709551612,var_off=(0x0; 0xfffffffffffffffc))"},
+			{7, "R5_w=inv(id=0,smax_value=9223372036854775804,umax_value=18446744073709551612,var_off=(0x0; 0xfffffffffffffffc))"},
 			/* (4n) + 14 == (4n+2).  We blow our bounds, because
 			 * the add could overflow.
 			 */
@@ -485,7 +485,7 @@ static struct bpf_align_test tests[] = {
 			/* Checked s>=0 */
 			{10, "R5=inv(id=0,umin_value=2,umax_value=9223372036854775806,var_off=(0x2; 0x7ffffffffffffffc))"},
 			/* packet pointer + nonnegative (4n+2) */
-			{12, "R6=pkt(id=1,off=0,r=0,umin_value=2,umax_value=9223372036854775806,var_off=(0x2; 0x7ffffffffffffffc))"},
+			{12, "R6_w=pkt(id=1,off=0,r=0,umin_value=2,umax_value=9223372036854775806,var_off=(0x2; 0x7ffffffffffffffc))"},
 			{14, "R4=pkt(id=1,off=4,r=0,umin_value=2,umax_value=9223372036854775806,var_off=(0x2; 0x7ffffffffffffffc))"},
 			/* NET_IP_ALIGN + (4n+2) == (4n), alignment is fine.
 			 * We checked the bounds, but it might have been able
@@ -530,11 +530,11 @@ static struct bpf_align_test tests[] = {
 			 * alignment of 4.
 			 */
 			{7, "R2=pkt(id=0,off=0,r=8,imm=0)"},
-			{9, "R6=inv(id=0,umax_value=1020,var_off=(0x0; 0x3fc))"},
+			{9, "R6_w=inv(id=0,umax_value=1020,var_off=(0x0; 0x3fc))"},
 			/* Adding 14 makes R6 be (4n+2) */
-			{10, "R6=inv(id=0,umin_value=14,umax_value=1034,var_off=(0x2; 0x7fc))"},
+			{10, "R6_w=inv(id=0,umin_value=14,umax_value=1034,var_off=(0x2; 0x7fc))"},
 			/* New unknown value in R7 is (4n) */
-			{11, "R7=inv(id=0,umax_value=1020,var_off=(0x0; 0x3fc))"},
+			{11, "R7_w=inv(id=0,umax_value=1020,var_off=(0x0; 0x3fc))"},
 			/* Subtracting it from R6 blows our unsigned bounds */
 			{12, "R6=inv(id=0,smin_value=-1006,smax_value=1034,var_off=(0x2; 0xfffffffffffffffc))"},
 			/* Checked s>= 0 */
@@ -583,15 +583,15 @@ static struct bpf_align_test tests[] = {
 			 * alignment of 4.
 			 */
 			{7, "R2=pkt(id=0,off=0,r=8,imm=0)"},
-			{10, "R6=inv(id=0,umax_value=60,var_off=(0x0; 0x3c))"},
+			{10, "R6_w=inv(id=0,umax_value=60,var_off=(0x0; 0x3c))"},
 			/* Adding 14 makes R6 be (4n+2) */
-			{11, "R6=inv(id=0,umin_value=14,umax_value=74,var_off=(0x2; 0x7c))"},
+			{11, "R6_w=inv(id=0,umin_value=14,umax_value=74,var_off=(0x2; 0x7c))"},
 			/* Subtracting from packet pointer overflows ubounds */
-			{13, "R5=pkt(id=1,off=0,r=8,umin_value=18446744073709551542,umax_value=18446744073709551602,var_off=(0xffffffffffffff82; 0x7c))"},
+			{13, "R5_w=pkt(id=1,off=0,r=8,umin_value=18446744073709551542,umax_value=18446744073709551602,var_off=(0xffffffffffffff82; 0x7c))"},
 			/* New unknown value in R7 is (4n), >= 76 */
-			{15, "R7=inv(id=0,umin_value=76,umax_value=1096,var_off=(0x0; 0x7fc))"},
+			{15, "R7_w=inv(id=0,umin_value=76,umax_value=1096,var_off=(0x0; 0x7fc))"},
 			/* Adding it to packet pointer gives nice bounds again */
-			{16, "R5=pkt(id=2,off=0,r=0,umin_value=2,umax_value=1082,var_off=(0x2; 0x7fc))"},
+			{16, "R5_w=pkt(id=2,off=0,r=0,umin_value=2,umax_value=1082,var_off=(0x2; 0x7fc))"},
 			/* At the time the word size load is performed from R5,
 			 * its total fixed offset is NET_IP_ALIGN + reg->off (0)
 			 * which is 2.  Then the variable offset is (4n+2), so
diff --git a/tools/testing/selftests/bpf/test_offload.py b/tools/testing/selftests/bpf/test_offload.py
new file mode 100755
index 000000000000..3914f7a4585a
--- /dev/null
+++ b/tools/testing/selftests/bpf/test_offload.py
@@ -0,0 +1,681 @@
+#!/usr/bin/python3
+
+# Copyright (C) 2017 Netronome Systems, Inc.
+#
+# This software is licensed under the GNU General License Version 2,
+# June 1991 as shown in the file COPYING in the top-level directory of this
+# source tree.
+#
+# THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS"
+# WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING,
+# BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+# FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE
+# OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME
+# THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
+
+from datetime import datetime
+import argparse
+import json
+import os
+import pprint
+import subprocess
+import time
+
+logfile = None
+log_level = 1
+bpf_test_dir = os.path.dirname(os.path.realpath(__file__))
+pp = pprint.PrettyPrinter()
+devs = [] # devices we created for clean up
+files = [] # files to be removed
+
+def log_get_sec(level=0):
+    return "*" * (log_level + level)
+
+def log_level_inc(add=1):
+    global log_level
+    log_level += add
+
+def log_level_dec(sub=1):
+    global log_level
+    log_level -= sub
+
+def log_level_set(level):
+    global log_level
+    log_level = level
+
+def log(header, data, level=None):
+    """
+    Output to an optional log.
+    """
+    if logfile is None:
+        return
+    if level is not None:
+        log_level_set(level)
+
+    if not isinstance(data, str):
+        data = pp.pformat(data)
+
+    if len(header):
+        logfile.write("\n" + log_get_sec() + " ")
+        logfile.write(header)
+    if len(header) and len(data.strip()):
+        logfile.write("\n")
+    logfile.write(data)
+
+def skip(cond, msg):
+    if not cond:
+        return
+    print("SKIP: " + msg)
+    log("SKIP: " + msg, "", level=1)
+    os.sys.exit(0)
+
+def fail(cond, msg):
+    if not cond:
+        return
+    print("FAIL: " + msg)
+    log("FAIL: " + msg, "", level=1)
+    os.sys.exit(1)
+
+def start_test(msg):
+    log(msg, "", level=1)
+    log_level_inc()
+    print(msg)
+
+def cmd(cmd, shell=True, include_stderr=False, background=False, fail=True):
+    """
+    Run a command in subprocess and return tuple of (retval, stdout);
+    optionally return stderr as well as third value.
+    """
+    proc = subprocess.Popen(cmd, shell=shell, stdout=subprocess.PIPE,
+                            stderr=subprocess.PIPE)
+    if background:
+        msg = "%s START: %s" % (log_get_sec(1),
+                                datetime.now().strftime("%H:%M:%S.%f"))
+        log("BKG " + proc.args, msg)
+        return proc
+
+    return cmd_result(proc, include_stderr=include_stderr, fail=fail)
+
+def cmd_result(proc, include_stderr=False, fail=False):
+    stdout, stderr = proc.communicate()
+    stdout = stdout.decode("utf-8")
+    stderr = stderr.decode("utf-8")
+    proc.stdout.close()
+    proc.stderr.close()
+
+    stderr = "\n" + stderr
+    if stderr[-1] == "\n":
+        stderr = stderr[:-1]
+
+    sec = log_get_sec(1)
+    log("CMD " + proc.args,
+        "RETCODE: %d\n%s STDOUT:\n%s%s STDERR:%s\n%s END: %s" %
+        (proc.returncode, sec, stdout, sec, stderr,
+         sec, datetime.now().strftime("%H:%M:%S.%f")))
+
+    if proc.returncode != 0 and fail:
+        if len(stderr) > 0 and stderr[-1] == "\n":
+            stderr = stderr[:-1]
+        raise Exception("Command failed: %s\n%s" % (proc.args, stderr))
+
+    if include_stderr:
+        return proc.returncode, stdout, stderr
+    else:
+        return proc.returncode, stdout
+
+def rm(f):
+    cmd("rm -f %s" % (f))
+    if f in files:
+        files.remove(f)
+
+def tool(name, args, flags, JSON=True, fail=True):
+    params = ""
+    if JSON:
+        params += "%s " % (flags["json"])
+
+    ret, out = cmd(name + " " + params + args, fail=fail)
+    if JSON and len(out.strip()) != 0:
+        return ret, json.loads(out)
+    else:
+        return ret, out
+
+def bpftool(args, JSON=True, fail=True):
+    return tool("bpftool", args, {"json":"-p"}, JSON=JSON, fail=fail)
+
+def bpftool_prog_list(expected=None):
+    _, progs = bpftool("prog show", JSON=True, fail=True)
+    if expected is not None:
+        if len(progs) != expected:
+            fail(True, "%d BPF programs loaded, expected %d" %
+                 (len(progs), expected))
+    return progs
+
+def bpftool_prog_list_wait(expected=0, n_retry=20):
+    for i in range(n_retry):
+        nprogs = len(bpftool_prog_list())
+        if nprogs == expected:
+            return
+        time.sleep(0.05)
+    raise Exception("Time out waiting for program counts to stabilize want %d, have %d" % (expected, nprogs))
+
+def ip(args, force=False, JSON=True, fail=True):
+    if force:
+        args = "-force " + args
+    return tool("ip", args, {"json":"-j"}, JSON=JSON, fail=fail)
+
+def tc(args, JSON=True, fail=True):
+    return tool("tc", args, {"json":"-p"}, JSON=JSON, fail=fail)
+
+def ethtool(dev, opt, args, fail=True):
+    return cmd("ethtool %s %s %s" % (opt, dev["ifname"], args), fail=fail)
+
+def bpf_obj(name, sec=".text", path=bpf_test_dir,):
+    return "obj %s sec %s" % (os.path.join(path, name), sec)
+
+def bpf_pinned(name):
+    return "pinned %s" % (name)
+
+def bpf_bytecode(bytecode):
+    return "bytecode \"%s\"" % (bytecode)
+
+class DebugfsDir:
+    """
+    Class for accessing DebugFS directories as a dictionary.
+    """
+
+    def __init__(self, path):
+        self.path = path
+        self._dict = self._debugfs_dir_read(path)
+
+    def __len__(self):
+        return len(self._dict.keys())
+
+    def __getitem__(self, key):
+        if type(key) is int:
+            key = list(self._dict.keys())[key]
+        return self._dict[key]
+
+    def __setitem__(self, key, value):
+        log("DebugFS set %s = %s" % (key, value), "")
+        log_level_inc()
+
+        cmd("echo '%s' > %s/%s" % (value, self.path, key))
+        log_level_dec()
+
+        _, out = cmd('cat %s/%s' % (self.path, key))
+        self._dict[key] = out.strip()
+
+    def _debugfs_dir_read(self, path):
+        dfs = {}
+
+        log("DebugFS state for %s" % (path), "")
+        log_level_inc(add=2)
+
+        _, out = cmd('ls ' + path)
+        for f in out.split():
+            p = os.path.join(path, f)
+            if os.path.isfile(p):
+                _, out = cmd('cat %s/%s' % (path, f))
+                dfs[f] = out.strip()
+            elif os.path.isdir(p):
+                dfs[f] = DebugfsDir(p)
+            else:
+                raise Exception("%s is neither file nor directory" % (p))
+
+        log_level_dec()
+        log("DebugFS state", dfs)
+        log_level_dec()
+
+        return dfs
+
+class NetdevSim:
+    """
+    Class for netdevsim netdevice and its attributes.
+    """
+
+    def __init__(self):
+        self.dev = self._netdevsim_create()
+        devs.append(self)
+
+        self.dfs_dir = '/sys/kernel/debug/netdevsim/%s' % (self.dev['ifname'])
+        self.dfs_refresh()
+
+    def __getitem__(self, key):
+        return self.dev[key]
+
+    def _netdevsim_create(self):
+        _, old  = ip("link show")
+        ip("link add sim%d type netdevsim")
+        _, new  = ip("link show")
+
+        for dev in new:
+            f = filter(lambda x: x["ifname"] == dev["ifname"], old)
+            if len(list(f)) == 0:
+                return dev
+
+        raise Exception("failed to create netdevsim device")
+
+    def remove(self):
+        devs.remove(self)
+        ip("link del dev %s" % (self.dev["ifname"]))
+
+    def dfs_refresh(self):
+        self.dfs = DebugfsDir(self.dfs_dir)
+        return self.dfs
+
+    def dfs_num_bound_progs(self):
+        path = os.path.join(self.dfs_dir, "bpf_bound_progs")
+        _, progs = cmd('ls %s' % (path))
+        return len(progs.split())
+
+    def dfs_get_bound_progs(self, expected):
+        progs = DebugfsDir(os.path.join(self.dfs_dir, "bpf_bound_progs"))
+        if expected is not None:
+            if len(progs) != expected:
+                fail(True, "%d BPF programs bound, expected %d" %
+                     (len(progs), expected))
+        return progs
+
+    def wait_for_flush(self, bound=0, total=0, n_retry=20):
+        for i in range(n_retry):
+            nbound = self.dfs_num_bound_progs()
+            nprogs = len(bpftool_prog_list())
+            if nbound == bound and nprogs == total:
+                return
+            time.sleep(0.05)
+        raise Exception("Time out waiting for program counts to stabilize want %d/%d, have %d bound, %d loaded" % (bound, total, nbound, nprogs))
+
+    def set_mtu(self, mtu, fail=True):
+        return ip("link set dev %s mtu %d" % (self.dev["ifname"], mtu),
+                  fail=fail)
+
+    def set_xdp(self, bpf, mode, force=False, fail=True):
+        return ip("link set dev %s xdp%s %s" % (self.dev["ifname"], mode, bpf),
+                  force=force, fail=fail)
+
+    def unset_xdp(self, mode, force=False, fail=True):
+        return ip("link set dev %s xdp%s off" % (self.dev["ifname"], mode),
+                  force=force, fail=fail)
+
+    def ip_link_show(self, xdp):
+        _, link = ip("link show dev %s" % (self['ifname']))
+        if len(link) > 1:
+            raise Exception("Multiple objects on ip link show")
+        if len(link) < 1:
+            return {}
+        fail(xdp != "xdp" in link,
+             "XDP program not reporting in iplink (reported %s, expected %s)" %
+             ("xdp" in link, xdp))
+        return link[0]
+
+    def tc_add_ingress(self):
+        tc("qdisc add dev %s ingress" % (self['ifname']))
+
+    def tc_del_ingress(self):
+        tc("qdisc del dev %s ingress" % (self['ifname']))
+
+    def tc_flush_filters(self, bound=0, total=0):
+        self.tc_del_ingress()
+        self.tc_add_ingress()
+        self.wait_for_flush(bound=bound, total=total)
+
+    def tc_show_ingress(self, expected=None):
+        # No JSON support, oh well...
+        flags = ["skip_sw", "skip_hw", "in_hw"]
+        named = ["protocol", "pref", "chain", "handle", "id", "tag"]
+
+        args = "-s filter show dev %s ingress" % (self['ifname'])
+        _, out = tc(args, JSON=False)
+
+        filters = []
+        lines = out.split('\n')
+        for line in lines:
+            words = line.split()
+            if "handle" not in words:
+                continue
+            fltr = {}
+            for flag in flags:
+                fltr[flag] = flag in words
+            for name in named:
+                try:
+                    idx = words.index(name)
+                    fltr[name] = words[idx + 1]
+                except ValueError:
+                    pass
+            filters.append(fltr)
+
+        if expected is not None:
+            fail(len(filters) != expected,
+                 "%d ingress filters loaded, expected %d" %
+                 (len(filters), expected))
+        return filters
+
+    def cls_bpf_add_filter(self, bpf, da=False, skip_sw=False, skip_hw=False,
+                           fail=True):
+        params = ""
+        if da:
+            params += " da"
+        if skip_sw:
+            params += " skip_sw"
+        if skip_hw:
+            params += " skip_hw"
+        return tc("filter add dev %s ingress bpf %s %s" %
+                  (self['ifname'], bpf, params), fail=fail)
+
+    def set_ethtool_tc_offloads(self, enable, fail=True):
+        args = "hw-tc-offload %s" % ("on" if enable else "off")
+        return ethtool(self, "-K", args, fail=fail)
+
+################################################################################
+def clean_up():
+    for dev in devs:
+        dev.remove()
+    for f in files:
+        cmd("rm -f %s" % (f))
+
+def pin_prog(file_name, idx=0):
+    progs = bpftool_prog_list(expected=(idx + 1))
+    prog = progs[idx]
+    bpftool("prog pin id %d %s" % (prog["id"], file_name))
+    files.append(file_name)
+
+    return file_name, bpf_pinned(file_name)
+
+# Parse command line
+parser = argparse.ArgumentParser()
+parser.add_argument("--log", help="output verbose log to given file")
+args = parser.parse_args()
+if args.log:
+    logfile = open(args.log, 'w+')
+    logfile.write("# -*-Org-*-")
+
+log("Prepare...", "", level=1)
+log_level_inc()
+
+# Check permissions
+skip(os.getuid() != 0, "test must be run as root")
+
+# Check tools
+ret, progs = bpftool("prog", fail=False)
+skip(ret != 0, "bpftool not installed")
+# Check no BPF programs are loaded
+skip(len(progs) != 0, "BPF programs already loaded on the system")
+
+# Check netdevsim
+ret, out = cmd("modprobe netdevsim", fail=False)
+skip(ret != 0, "netdevsim module could not be loaded")
+
+# Check debugfs
+_, out = cmd("mount")
+if out.find("/sys/kernel/debug type debugfs") == -1:
+    cmd("mount -t debugfs none /sys/kernel/debug")
+
+# Check samples are compiled
+samples = ["sample_ret0.o"]
+for s in samples:
+    ret, out = cmd("ls %s/%s" % (bpf_test_dir, s), fail=False)
+    skip(ret != 0, "sample %s/%s not found, please compile it" %
+         (bpf_test_dir, s))
+
+try:
+    obj = bpf_obj("sample_ret0.o")
+    bytecode = bpf_bytecode("1,6 0 0 4294967295,")
+
+    start_test("Test destruction of generic XDP...")
+    sim = NetdevSim()
+    sim.set_xdp(obj, "generic")
+    sim.remove()
+    bpftool_prog_list_wait(expected=0)
+
+    sim = NetdevSim()
+    sim.tc_add_ingress()
+
+    start_test("Test TC non-offloaded...")
+    ret, _ = sim.cls_bpf_add_filter(obj, skip_hw=True, fail=False)
+    fail(ret != 0, "Software TC filter did not load")
+
+    start_test("Test TC non-offloaded isn't getting bound...")
+    ret, _ = sim.cls_bpf_add_filter(obj, fail=False)
+    fail(ret != 0, "Software TC filter did not load")
+    sim.dfs_get_bound_progs(expected=0)
+
+    sim.tc_flush_filters()
+
+    start_test("Test TC offloads are off by default...")
+    ret, _ = sim.cls_bpf_add_filter(obj, skip_sw=True, fail=False)
+    fail(ret == 0, "TC filter loaded without enabling TC offloads")
+    sim.wait_for_flush()
+
+    sim.set_ethtool_tc_offloads(True)
+    sim.dfs["bpf_tc_non_bound_accept"] = "Y"
+
+    start_test("Test TC offload by default...")
+    ret, _ = sim.cls_bpf_add_filter(obj, fail=False)
+    fail(ret != 0, "Software TC filter did not load")
+    sim.dfs_get_bound_progs(expected=0)
+    ingress = sim.tc_show_ingress(expected=1)
+    fltr = ingress[0]
+    fail(not fltr["in_hw"], "Filter not offloaded by default")
+
+    sim.tc_flush_filters()
+
+    start_test("Test TC cBPF bytcode tries offload by default...")
+    ret, _ = sim.cls_bpf_add_filter(bytecode, fail=False)
+    fail(ret != 0, "Software TC filter did not load")
+    sim.dfs_get_bound_progs(expected=0)
+    ingress = sim.tc_show_ingress(expected=1)
+    fltr = ingress[0]
+    fail(not fltr["in_hw"], "Bytecode not offloaded by default")
+
+    sim.tc_flush_filters()
+    sim.dfs["bpf_tc_non_bound_accept"] = "N"
+
+    start_test("Test TC cBPF unbound bytecode doesn't offload...")
+    ret, _ = sim.cls_bpf_add_filter(bytecode, skip_sw=True, fail=False)
+    fail(ret == 0, "TC bytecode loaded for offload")
+    sim.wait_for_flush()
+
+    start_test("Test TC offloads work...")
+    ret, _ = sim.cls_bpf_add_filter(obj, skip_sw=True, fail=False)
+    fail(ret != 0, "TC filter did not load with TC offloads enabled")
+
+    start_test("Test TC offload basics...")
+    dfs = sim.dfs_get_bound_progs(expected=1)
+    progs = bpftool_prog_list(expected=1)
+    ingress = sim.tc_show_ingress(expected=1)
+
+    dprog = dfs[0]
+    prog = progs[0]
+    fltr = ingress[0]
+    fail(fltr["skip_hw"], "TC does reports 'skip_hw' on offloaded filter")
+    fail(not fltr["in_hw"], "TC does not report 'in_hw' for offloaded filter")
+    fail(not fltr["skip_sw"], "TC does not report 'skip_sw' back")
+
+    start_test("Test TC offload is device-bound...")
+    fail(str(prog["id"]) != fltr["id"], "Program IDs don't match")
+    fail(prog["tag"] != fltr["tag"], "Program tags don't match")
+    fail(fltr["id"] != dprog["id"], "Program IDs don't match")
+    fail(dprog["state"] != "xlated", "Offloaded program state not translated")
+    fail(dprog["loaded"] != "Y", "Offloaded program is not loaded")
+
+    start_test("Test disabling TC offloads is rejected while filters installed...")
+    ret, _ = sim.set_ethtool_tc_offloads(False, fail=False)
+    fail(ret == 0, "Driver should refuse to disable TC offloads with filters installed...")
+
+    start_test("Test qdisc removal frees things...")
+    sim.tc_flush_filters()
+    sim.tc_show_ingress(expected=0)
+
+    start_test("Test disabling TC offloads is OK without filters...")
+    ret, _ = sim.set_ethtool_tc_offloads(False, fail=False)
+    fail(ret != 0,
+         "Driver refused to disable TC offloads without filters installed...")
+
+    sim.set_ethtool_tc_offloads(True)
+
+    start_test("Test destroying device gets rid of TC filters...")
+    sim.cls_bpf_add_filter(obj, skip_sw=True)
+    sim.remove()
+    bpftool_prog_list_wait(expected=0)
+
+    sim = NetdevSim()
+    sim.set_ethtool_tc_offloads(True)
+
+    start_test("Test destroying device gets rid of XDP...")
+    sim.set_xdp(obj, "offload")
+    sim.remove()
+    bpftool_prog_list_wait(expected=0)
+
+    sim = NetdevSim()
+    sim.set_ethtool_tc_offloads(True)
+
+    start_test("Test XDP prog reporting...")
+    sim.set_xdp(obj, "drv")
+    ipl = sim.ip_link_show(xdp=True)
+    progs = bpftool_prog_list(expected=1)
+    fail(ipl["xdp"]["prog"]["id"] != progs[0]["id"],
+         "Loaded program has wrong ID")
+
+    start_test("Test XDP prog replace without force...")
+    ret, _ = sim.set_xdp(obj, "drv", fail=False)
+    fail(ret == 0, "Replaced XDP program without -force")
+    sim.wait_for_flush(total=1)
+
+    start_test("Test XDP prog replace with force...")
+    ret, _ = sim.set_xdp(obj, "drv", force=True, fail=False)
+    fail(ret != 0, "Could not replace XDP program with -force")
+    bpftool_prog_list_wait(expected=1)
+    ipl = sim.ip_link_show(xdp=True)
+    progs = bpftool_prog_list(expected=1)
+    fail(ipl["xdp"]["prog"]["id"] != progs[0]["id"],
+         "Loaded program has wrong ID")
+
+    start_test("Test XDP prog replace with bad flags...")
+    ret, _ = sim.set_xdp(obj, "offload", force=True, fail=False)
+    fail(ret == 0, "Replaced XDP program with a program in different mode")
+    ret, _ = sim.set_xdp(obj, "", force=True, fail=False)
+    fail(ret == 0, "Replaced XDP program with a program in different mode")
+
+    start_test("Test XDP prog remove with bad flags...")
+    ret, _ = sim.unset_xdp("offload", force=True, fail=False)
+    fail(ret == 0, "Removed program with a bad mode mode")
+    ret, _ = sim.unset_xdp("", force=True, fail=False)
+    fail(ret == 0, "Removed program with a bad mode mode")
+
+    start_test("Test MTU restrictions...")
+    ret, _ = sim.set_mtu(9000, fail=False)
+    fail(ret == 0,
+         "Driver should refuse to increase MTU to 9000 with XDP loaded...")
+    sim.unset_xdp("drv")
+    bpftool_prog_list_wait(expected=0)
+    sim.set_mtu(9000)
+    ret, _ = sim.set_xdp(obj, "drv", fail=False)
+    fail(ret == 0, "Driver should refuse to load program with MTU of 9000...")
+    sim.set_mtu(1500)
+
+    sim.wait_for_flush()
+    start_test("Test XDP offload...")
+    sim.set_xdp(obj, "offload")
+    ipl = sim.ip_link_show(xdp=True)
+    link_xdp = ipl["xdp"]["prog"]
+    progs = bpftool_prog_list(expected=1)
+    prog = progs[0]
+    fail(link_xdp["id"] != prog["id"], "Loaded program has wrong ID")
+
+    start_test("Test XDP offload is device bound...")
+    dfs = sim.dfs_get_bound_progs(expected=1)
+    dprog = dfs[0]
+
+    fail(prog["id"] != link_xdp["id"], "Program IDs don't match")
+    fail(prog["tag"] != link_xdp["tag"], "Program tags don't match")
+    fail(str(link_xdp["id"]) != dprog["id"], "Program IDs don't match")
+    fail(dprog["state"] != "xlated", "Offloaded program state not translated")
+    fail(dprog["loaded"] != "Y", "Offloaded program is not loaded")
+
+    start_test("Test removing XDP program many times...")
+    sim.unset_xdp("offload")
+    sim.unset_xdp("offload")
+    sim.unset_xdp("drv")
+    sim.unset_xdp("drv")
+    sim.unset_xdp("")
+    sim.unset_xdp("")
+    bpftool_prog_list_wait(expected=0)
+
+    start_test("Test attempt to use a program for a wrong device...")
+    sim2 = NetdevSim()
+    sim2.set_xdp(obj, "offload")
+    pin_file, pinned = pin_prog("/sys/fs/bpf/tmp")
+
+    ret, _ = sim.set_xdp(pinned, "offload", fail=False)
+    fail(ret == 0, "Pinned program loaded for a different device accepted")
+    sim2.remove()
+    ret, _ = sim.set_xdp(pinned, "offload", fail=False)
+    fail(ret == 0, "Pinned program loaded for a removed device accepted")
+    rm(pin_file)
+    bpftool_prog_list_wait(expected=0)
+
+    start_test("Test mixing of TC and XDP...")
+    sim.tc_add_ingress()
+    sim.set_xdp(obj, "offload")
+    ret, _ = sim.cls_bpf_add_filter(obj, skip_sw=True, fail=False)
+    fail(ret == 0, "Loading TC when XDP active should fail")
+    sim.unset_xdp("offload")
+    sim.wait_for_flush()
+
+    sim.cls_bpf_add_filter(obj, skip_sw=True)
+    ret, _ = sim.set_xdp(obj, "offload", fail=False)
+    fail(ret == 0, "Loading XDP when TC active should fail")
+
+    start_test("Test binding TC from pinned...")
+    pin_file, pinned = pin_prog("/sys/fs/bpf/tmp")
+    sim.tc_flush_filters(bound=1, total=1)
+    sim.cls_bpf_add_filter(pinned, da=True, skip_sw=True)
+    sim.tc_flush_filters(bound=1, total=1)
+
+    start_test("Test binding XDP from pinned...")
+    sim.set_xdp(obj, "offload")
+    pin_file, pinned = pin_prog("/sys/fs/bpf/tmp2", idx=1)
+
+    sim.set_xdp(pinned, "offload", force=True)
+    sim.unset_xdp("offload")
+    sim.set_xdp(pinned, "offload", force=True)
+    sim.unset_xdp("offload")
+
+    start_test("Test offload of wrong type fails...")
+    ret, _ = sim.cls_bpf_add_filter(pinned, da=True, skip_sw=True, fail=False)
+    fail(ret == 0, "Managed to attach XDP program to TC")
+
+    start_test("Test asking for TC offload of two filters...")
+    sim.cls_bpf_add_filter(obj, da=True, skip_sw=True)
+    sim.cls_bpf_add_filter(obj, da=True, skip_sw=True)
+    # The above will trigger a splat until TC cls_bpf drivers are fixed
+
+    sim.tc_flush_filters(bound=2, total=2)
+
+    start_test("Test if netdev removal waits for translation...")
+    delay_msec = 500
+    sim.dfs["bpf_bind_verifier_delay"] = delay_msec
+    start = time.time()
+    cmd_line = "tc filter add dev %s ingress bpf %s da skip_sw" % \
+               (sim['ifname'], obj)
+    tc_proc = cmd(cmd_line, background=True, fail=False)
+    # Wait for the verifier to start
+    while sim.dfs_num_bound_progs() <= 2:
+        pass
+    sim.remove()
+    end = time.time()
+    ret, _ = cmd_result(tc_proc, fail=False)
+    time_diff = end - start
+    log("Time", "start:\t%s\nend:\t%s\ndiff:\t%s" % (start, end, time_diff))
+
+    fail(ret == 0, "Managed to load TC filter on a unregistering device")
+    delay_sec = delay_msec * 0.001
+    fail(time_diff < delay_sec, "Removal process took %s, expected %s" %
+         (time_diff, delay_sec))
+
+    print("%s: OK" % (os.path.basename(__file__)))
+
+finally:
+    log("Clean up...", "", level=1)
+    log_level_inc()
+    clean_up()