Age | Commit message (Collapse) | Author | Files | Lines |
|
Files removed in 'net-next' had their license header updated
in 'net'. We take the remove from 'net-next'.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Return success if the same dispatch function is being registered for
a given opcode and subcode, there by allow multiple switchdev enable
and disables.
Signed-off-by: Vijaya Mohan Guvva <vijaya.guvva@cavium.com>
Signed-off-by: Satanand Burla <satananda.burla@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Jiri Pirko says:
====================
mlxsw: Handle changes in GRE configuration
Petr says:
Until now, when an IP tunnel was offloaded by the mlxsw driver, the
offload was pretty much static, and changes in Linux configuration were
not reflected in the hardware. That led to discrepancies between traffic
flows in slow path and fast path. The work-around used to be to remove
all routes that forward to the netdevice and re-add them. This is
clearly suboptimal, but actually, as of the decap-only patchset, it's
not even enough anymore, and one needs to go all the way and simply drop
the tunnel and recreate it correctly.
With this patchset, the NETDEV_CHANGE events that are generated for
changes of up'd tunnel netdevices are captured and interpreted to
correctly reconfigure the HW in accordance with changes requested at the
software layer. In addition, NETDEV_CHANGEUPPER, NETDEV_UP and
NETDEV_DOWN are now handled not only for tunnel devices themselves, but
also for their bound devices. Each change is then translated to one or
more of the following updates to the HW configuration:
- refresh of offload of local route that corresponds to tunnel's local
address
- refresh of the loopback RIF
- refresh of offloads of routes that forward to the changed tunnel
- removal of tunnel offloads
These tools are used to implement the following configuration changes:
- addition of a new offloadable tunnel with local address that conflicts
with that of an already-offloaded tunnel (the existing tunnel is
onloaded, the new one isn't offloaded)
- changes to TTL, TOS that make tunnel unsuitable for offloading
- changes to ikey, okey, remote
- changes to local, which when they cause conflict with another
tunnel, lead to onloading of both newly-conflicting tunnels
- migration of a bound device of an offloaded tunnel device to a
different VRF
- changes to what device is bound to a tunnel device (i.e. like what
"ip tunnel change name g dev another" does)
- changes to up / down state of a bound device. A down bound device
doesn't forward encapsulated traffic anymore, but decap still works.
This patchset starts with a suite of patches that adapt the existing
code base step by step to facilitate introduction of the offloading
code. The five substantial patches at the end then implement the changes
mentioned above.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When the bound device of a tunnel device is down, encapsulated packets
are not egressed anymore, but tunnel decap still works. Extend
mlxsw_sp_nexthop_rif_update() to take IFF_UP into consideration when
deciding whether a given next hop should be offloaded.
Because the new logic was added to mlxsw_sp_nexthop_rif_update(), this
fixes the case where a newly-added tunnel has a down bound device, which
would previously be fully offloaded. Now the down state of the bound
device is noted and next hops forwarding to such tunnel are not
offloaded.
In addition to that, notice NETDEV_UP and NETDEV_DOWN of a bound device
to force refresh of tunnel encap route offloads.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When a bound device of an IP-in-IP tunnel changes, such as through
'ip tunnel change name $name dev $dev', the loopback backing the tunnel
needs to be recreated.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Changes to L3 tunnel netdevices (through `ip tunnel change' as well as
`ip link set') lead to NETDEV_CHANGE being generated on the tunnel
device. Because what is relevant for the tunnel in question depends on
the tunnel type, handling of the event is dispatched to the IPIP module
through a newly-added interface mlxsw_sp_ipip_ops.ol_netdev_change().
IPIP tunnels now remember the last set of tunnel parameters in struct
mlxsw_sp_ipip_entry.parms, and use it to figure out what exactly has
changed.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When a bound device of a tunnel netdevice changes VRF, the loopback RIF
that backs the tunnel needs to be updated and existing encapsulating
routes need to be refreshed.
Note that several tunnels can share the same bound device, in which case
all the impacted tunnels need to be updated.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The approach for offloading IP tunnels implemented currently by mlxsw
doesn't allow two tunnels that have the same local IP address in the
same (underlay) VRF. Previously, offloads were introduced on demand as
encap routes were formed. When such a route was created that would cause
offload of a conflicting tunnel, mlxsw_sp_ipip_entry_create() would
detect it and return -EEXIST, which would propagate up and cause FIB
abort.
Now however IPIP entries are created as soon as an offloadable netdevice
is created, and the failure prevents creation of such device.
Furthermore, if the driver is installed at the point where such
conflicting tunnels exist, the failure actually prevents successful
modprobe.
Furthermore, follow-up patches implement handling of NETDEV_CHANGE due
to the local address change. However, NETDEV_CHANGE can't be vetoed. The
failure merely means that the offloads weren't updated, but the change
in Linux configuration is not rolled back. It is thus desirable to have
a robust way of handling these conflicts, which can later be reused for
handling NETDEV_CHANGE as well.
To fix this, when a conflicting tunnel is created, instead of failing,
simply pull the old tunnel to slow path and reject offloading the
new one.
Introduce two functions: mlxsw_sp_ipip_entry_demote_tunnel() and
mlxsw_sp_ipip_demote_tunnel_by_saddr() to handle this. Make them both
public, because they will be useful later on in this patchset.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When trying to determine whether there are other offloaded tunnels with
the same local address, mlxsw_sp_ipip_entry_create() should look for a
tunnel with matching UL protocol, matching saddr, in the same VRF.
However instead of taking into account the UL protocol of the tunnel
netdevice (which mlxsw_sp_ipip_entry_saddr_matches() then compares to
the UL protocol of inspected IPIP entry), it deduces the UL protocol
from the inspected IPIP entry (and that's compared to itself).
This is currently immaterial, because only one tunnel type is offloaded,
and therefore the UL protocol always matches, but introducing support
for a tunnel with IPv6 underlay would uncover this error.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The work that needs to be done to update HW configuration in response to
changes is similar to what __mlxsw_sp_ipip_entry_update_tunnel() already
does, but with a number of twists: each change requires a different
subset of things to happen. Extend the function to support all these
uses, and allow finely-grained configuration of what should happen at
each call through a suite of function arguments.
Publish the updated function to allow use from the spectrum_ipip module.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The work that's done by mlxsw_sp_netdevice_ipip_ol_vrf_event() is a good
basis for a more versatile function that would take care of all sorts of
tunnel updates requests: __mlxsw_sp_ipip_entry_update_tunnel(). Extract
that function. Factor out a helper mlxsw_sp_ipip_entry_ol_lb_update() as
well.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The function mlxsw_sp_rif_create() takes an extack parameter. So far,
for creation of loopback interfaces, NULL was passed. For some events
however the extack can be extracted and passed along. So do that for
NETDEV_CHANGEUPPER handler.
Use the opportunity to update the type of info argument that
mlxsw_sp_netdevice_ipip_ol_event() takes. Follow-up patches will
introduce handling of more changes, and some of them carry an extack as
well, but in an info structure of a different type. Though not strictly
erroneous (the pointer could be cast whichever way), it makes no sense
to pretend the value is always of a certain type, when in fact it isn't.
So change the prototype of the above-mentioned function as well.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The piece of logic to promote decap route, if any, is useful for generic
tunnel updates, not just for handling of NETDEV_UP events on tunnel
interfaces. Extract it to a separate function.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This function only ever returns 0, so don't pretend it returns anything
useful and just make it void.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
To implement NETDEV_CHANGE notifications on IP-in-IP tunnels, the
handler needs to figure out what actually changed, to understand how
exactly to update the offloads. It will do so by storing struct
ip_tunnel_parm with previous configuration, and comparing that to the
new version.
To facilitate these comparisons, extract the code that operates on
struct ip_tunnel_parm from the existing accessor functions, and make
those a thin wrapper that extracts tunnel parameters and dispatches.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
These functions ideologically belong to the IPIP module, and some
follow-up work will benefit from their presence there.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Some of the code down the road needs this logic as well.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
To distinguish between events related to tunnel device itself and its
bound device, rename a number of functions related to handling tunneling
netdevice events to include _ol_ (for "overlay") in the name. That
leaves room in the namespace for underlay-related functions, which would
have _ul_ in the name.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fix from Stephen Boyd:
"One fix for USB clks on Uniphier PXs3 SoCs"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: uniphier: fix clock data for PXs3
|
|
Pull arch/tile fixes from Chris Metcalf:
"Two one-line bug fixes"
* git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
arch/tile: Implement ->set_state_oneshot_stopped()
tile: pass machine size to sparse
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fix from James Bottomley:
"One minor fix in the error leg of the qla2xxx driver (it oopses the
system if we get an error trying to start the internal kernel thread).
The fix is minor because the problem isn't often encountered in the
field (although it can be induced by inserting the module in a low
memory environment)"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: qla2xxx: Fix oops in qla2x00_probe_one error path
|
|
set_state_oneshot_stopped() is called by the clkevt core, when the
next event is required at an expiry time of 'KTIME_MAX'. This normally
happens with NO_HZ_{IDLE|FULL} in both LOWRES/HIGHRES modes.
This patch makes the clockevent device to stop on such an event, to
avoid spurious interrupts, as explained by: commit 8fff52fd5093
("clockevents: Introduce CLOCK_EVT_STATE_ONESHOT_STOPPED state").
Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"Some more powerpc fixes for 4.14.
This is bigger than I like to send at rc7, but that's at least partly
because I didn't send any fixes last week. If it wasn't for the IMC
driver, which is new and getting heavy testing, the diffstat would
look a bit better. I've also added ftrace on big endian to my test
suite, so we shouldn't break that again in future.
- A fix to the handling of misaligned paste instructions (P9 only),
where a change to a #define has caused the check for the
instruction to always fail.
- The preempt handling was unbalanced in the radix THP flush (P9
only). Though we don't generally use preempt we want to keep it
working as much as possible.
- Two fixes for IMC (P9 only), one when booting with restricted
number of CPUs and one in the error handling when initialisation
fails due to firmware etc.
- A revert to fix function_graph on big endian machines, and then a
rework of the reverted patch to fix kprobes blacklist handling on
big endian machines.
Thanks to: Anju T Sudhakar, Guilherme G. Piccoli, Madhavan Srinivasan,
Naveen N. Rao, Nicholas Piggin, Paul Mackerras"
* tag 'powerpc-4.14-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/perf: Fix core-imc hotplug callback failure during imc initialization
powerpc/kprobes: Dereference function pointers only if the address does not belong to kernel text
Revert "powerpc64/elfv1: Only dereference function descriptor for non-text symbols"
powerpc/64s/radix: Fix preempt imbalance in TLB flush
powerpc: Fix check for copy/paste instructions in alignment handler
powerpc/perf: Fix IMC allocation routine
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC fixes from Ulf Hansson:
"Fix dw_mmc request timeout issues"
* tag 'mmc-v4.14-rc4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: dw_mmc: Fix the DTO timeout calculation
mmc: dw_mmc: Add locking to the CTO timer
mmc: dw_mmc: Fix the CTO timeout calculation
mmc: dw_mmc: cancel the CTO timer after a voltage switch
|
|
git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
- one nouveau regression fix
- some amdgpu fixes for stable to fix hangs on some harvested Polaris
GPUs
- a set of KASAN and regression fixes for i915, their CI system seems
to be working pretty well now.
* tag 'drm-fixes-for-v4.14-rc8' of git://people.freedesktop.org/~airlied/linux:
drm/amdgpu: allow harvesting check for Polaris VCE
drm/amdgpu: return -ENOENT from uvd 6.0 early init for harvesting
drm/i915: Check incoming alignment for unfenced buffers (on i915gm)
drm/nouveau/kms/nv50: use the correct state for base channel notifier setup
drm/i915: Hold rcu_read_lock when iterating over the radixtree (vma idr)
drm/i915: Hold rcu_read_lock when iterating over the radixtree (objects)
drm/i915/edp: read edp display control registers unconditionally
drm/i915: Do not rely on wm preservation for ILK watermarks
drm/i915: Cancel the modeset retry work during modeset cleanup
|
|
Pull networking fixes from David Miller:
"Hopefully this is the last batch of networking fixes for 4.14
Fingers crossed...
1) Fix stmmac to use the proper sized OF property read, from Bhadram
Varka.
2) Fix use after free in net scheduler tc action code, from Cong
Wang.
3) Fix SKB control block mangling in tcp_make_synack().
4) Use proper locking in fib_dump_info(), from Florian Westphal.
5) Fix IPG encodings in systemport driver, from Florian Fainelli.
6) Fix division by zero in NV TCP congestion control module, from
Konstantin Khlebnikov.
7) Fix use after free in nf_reject_ipv4, from Tejaswi Tanikella"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
net: systemport: Correct IPG length settings
tcp: do not mangle skb->cb[] in tcp_make_synack()
fib: fib_dump_info can no longer use __in_dev_get_rtnl
stmmac: use of_property_read_u32 instead of read_u8
net_sched: hold netns refcnt for each action
net_sched: acquire RTNL in tc_action_net_exit()
net: vrf: correct FRA_L3MDEV encode type
tcp_nv: fix division by zero in tcpnv_acked()
netfilter: nf_reject_ipv4: Fix use-after-free in send_reset
netfilter: nft_set_hash: disable fast_ops for 2-len keys
|
|
Merge misc fixes from Andrew Morton:
"7 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
mm, swap: fix race between swap count continuation operations
mm/huge_memory.c: deposit page table when copying a PMD migration entry
initramfs: fix initramfs rebuilds w/ compression after disabling
fs/hugetlbfs/inode.c: fix hwpoison reserve accounting
ocfs2: fstrim: Fix start offset of first cluster group during fstrim
mm, /proc/pid/pagemap: fix soft dirty marking for PMD migration entry
userfaultfd: hugetlbfs: prevent UFFDIO_COPY to fill beyond the end of i_size
|
|
MIPS will soon not be a part of Imagination Technologies, and as such
many @imgtec.com email addresses will no longer be valid. This patch
updates the addresses for those who:
- Have 10 or more patches in mainline authored using an @imgtec.com
email address, or any patches dated within the past year.
- Are still with Imagination but leaving as part of the MIPS business
unit, as determined from an internal email address list.
- Haven't already updated their email address (ie. JamesH) or expressed
a desire to be excluded (ie. Maciej).
- Acked v2 or earlier of this patch, which leaves Deng-Cheng, Matt &
myself.
New addresses are of the form firstname.lastname@mips.com, and all
verified against an internal email address list. An entry is added to
.mailmap for each person such that get_maintainer.pl will report the new
addresses rather than @imgtec.com addresses which will soon be dead.
Instances of the affected addresses throughout the tree are then
mechanically replaced with the new @mips.com address.
Signed-off-by: Paul Burton <paul.burton@mips.com>
Cc: Deng-Cheng Zhu <dengcheng.zhu@imgtec.com>
Cc: Deng-Cheng Zhu <dengcheng.zhu@mips.com>
Acked-by: Dengcheng Zhu <dengcheng.zhu@mips.com>
Cc: Matt Redfearn <matt.redfearn@imgtec.com>
Cc: Matt Redfearn <matt.redfearn@mips.com>
Acked-by: Matt Redfearn <matt.redfearn@mips.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: trivial@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Commit 890da9cf0983 (Revert "x86: do not use cpufreq_quick_get() for
/proc/cpuinfo "cpu MHz"") is not sufficient to restore the previous
behavior of "cpu MHz" in /proc/cpuinfo on x86 due to some changes
made after the commit it has reverted.
To address this, make the code in question use arch_freq_get_on_cpu()
which also is used by cpufreq for reporting the current frequency of
CPUs and since that function doesn't really depend on cpufreq in any
way, drop the CONFIG_CPU_FREQ dependency for the object file
containing it.
Also refactor arch_freq_get_on_cpu() somewhat to avoid IPIs and
return cached values right away if it is called very often over a
short time (to prevent user space from triggering IPI storms through
it).
Fixes: 890da9cf0983 (Revert "x86: do not use cpufreq_quick_get() for /proc/cpuinfo "cpu MHz"")
Cc: stable@kernel.org # 4.13 - together with 890da9cf0983
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
One page may store a set of entries of the sis->swap_map
(swap_info_struct->swap_map) in multiple swap clusters.
If some of the entries has sis->swap_map[offset] > SWAP_MAP_MAX,
multiple pages will be used to store the set of entries of the
sis->swap_map. And the pages are linked with page->lru. This is called
swap count continuation. To access the pages which store the set of
entries of the sis->swap_map simultaneously, previously, sis->lock is
used. But to improve the scalability of __swap_duplicate(), swap
cluster lock may be used in swap_count_continued() now. This may race
with add_swap_count_continuation() which operates on a nearby swap
cluster, in which the sis->swap_map entries are stored in the same page.
The race can cause wrong swap count in practice, thus cause unfreeable
swap entries or software lockup, etc.
To fix the race, a new spin lock called cont_lock is added to struct
swap_info_struct to protect the swap count continuation page list. This
is a lock at the swap device level, so the scalability isn't very well.
But it is still much better than the original sis->lock, because it is
only acquired/released when swap count continuation is used. Which is
considered rare in practice. If it turns out that the scalability
becomes an issue for some workloads, we can split the lock into some
more fine grained locks.
Link: http://lkml.kernel.org/r/20171017081320.28133-1-ying.huang@intel.com
Fixes: 235b62176712 ("mm/swap: add cluster lock")
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Shaohua Li <shli@kernel.org>
Cc: Tim Chen <tim.c.chen@intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Aaron Lu <aaron.lu@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org> [4.11+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
We need to deposit pre-allocated PTE page table when a PMD migration
entry is copied in copy_huge_pmd(). Otherwise, we will leak the
pre-allocated page and cause a NULL pointer dereference later in
zap_huge_pmd().
The missing counters during PMD migration entry copy process are added
as well.
The bug report is here: https://lkml.org/lkml/2017/10/29/214
Link: http://lkml.kernel.org/r/20171030144636.4836-1-zi.yan@sent.com
Fixes: 84c3fc4e9c563 ("mm: thp: check pmd migration entry in common path")
Signed-off-by: Zi Yan <zi.yan@cs.rutgers.edu>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This is a follow-up to commit 57ddfdaa9a72 ("initramfs: fix disabling of
initramfs (and its compression)"). This particular commit fixed the use
case where we build the kernel with an initramfs with no compression,
and then we build the kernel with no initramfs.
Now this still left us with the same case as described here:
http://lkml.kernel.org/r/20170521033337.6197-1-f.fainelli@gmail.com
not working with initramfs compression. This can be seen by the
following steps/timestamps:
https://www.spinics.net/lists/kernel/msg2598153.html
.initramfs_data.cpio.gz.cmd is correct:
cmd_usr/initramfs_data.cpio.gz := /bin/bash
./scripts/gen_initramfs_list.sh -o usr/initramfs_data.cpio.gz -u 1000 -g 1000 /home/fainelli/work/uclinux-rootfs/romfs /home/fainelli/work/uclinux-rootfs/misc/initramfs.dev
and was generated the first time we did generate the gzip initramfs, so
the command has not changed, nor its arguments, so we just don't call
it, no initramfs cpio is re-generated as a consequence.
The fix for this problem is just to properly keep track of the
.initramfs_cpio_data.d file by suffixing it with the compression
extension. This takes care of properly tracking dependencies such that
the initramfs get (re)generated any time files are added/deleted etc.
Link: http://lkml.kernel.org/r/20170930033936.6722-1-f.fainelli@gmail.com
Fixes: db2aa7fd15e8 ("initramfs: allow again choice of the embedded initramfs compression algorithm")
Fixes: 9e3596b0c653 ("kbuild: initramfs cleanup, set target from Kconfig")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Cc: "Francisco Blas Izquierdo Riera (klondike)" <klondike@xiscosoft.net>
Cc: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Calling madvise(MADV_HWPOISON) on a hugetlbfs page will result in bad
(negative) reserved huge page counts. This may not happen immediately,
but may happen later when the underlying file is removed or filesystem
unmounted. For example:
AnonHugePages: 0 kB
ShmemHugePages: 0 kB
HugePages_Total: 1
HugePages_Free: 0
HugePages_Rsvd: 18446744073709551615
HugePages_Surp: 0
Hugepagesize: 2048 kB
In routine hugetlbfs_error_remove_page(), hugetlb_fix_reserve_counts is
called after remove_huge_page. hugetlb_fix_reserve_counts is designed
to only be called/used only if a failure is returned from
hugetlb_unreserve_pages. Therefore, call hugetlb_unreserve_pages as
required and only call hugetlb_fix_reserve_counts in the unlikely event
that hugetlb_unreserve_pages returns an error.
Link: http://lkml.kernel.org/r/20171019230007.17043-2-mike.kravetz@oracle.com
Fixes: 78bb920344b8 ("mm: hwpoison: dissolve in-use hugepage in unrecoverable memory error")
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The first cluster group descriptor is not stored at the start of the
group but at an offset from the start. We need to take this into
account while doing fstrim on the first cluster group. Otherwise we
will wrongly start fstrim a few blocks after the desired start block and
the range can cross over into the next cluster group and zero out the
group descriptor there. This can cause filesytem corruption that cannot
be fixed by fsck.
Link: http://lkml.kernel.org/r/1507835579-7308-1-git-send-email-ashish.samant@oracle.com
Signed-off-by: Ashish Samant <ashish.samant@oracle.com>
Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: Joseph Qi <jiangqi903@gmail.com>
Cc: Mark Fasheh <mfasheh@versity.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
When the pagetable is walked in the implementation of /proc/<pid>/pagemap,
pmd_soft_dirty() is used for both the PMD huge page map and the PMD
migration entries. That is wrong, pmd_swp_soft_dirty() should be used
for the PMD migration entries instead because the different page table
entry flag is used.
As a result, /proc/pid/pagemap may report incorrect soft dirty information
for PMD migration entries.
Link: http://lkml.kernel.org/r/20171017081818.31795-1-ying.huang@intel.com
Fixes: 84c3fc4e9c56 ("mm: thp: check pmd migration entry in common path")
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: Daniel Colascione <dancol@google.com>
Cc: Zi Yan <zi.yan@cs.rutgers.edu>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This oops:
kernel BUG at fs/hugetlbfs/inode.c:484!
RIP: remove_inode_hugepages+0x3d0/0x410
Call Trace:
hugetlbfs_setattr+0xd9/0x130
notify_change+0x292/0x410
do_truncate+0x65/0xa0
do_sys_ftruncate.constprop.3+0x11a/0x180
SyS_ftruncate+0xe/0x10
tracesys+0xd9/0xde
was caused by the lack of i_size check in hugetlb_mcopy_atomic_pte.
mmap() can still succeed beyond the end of the i_size after vmtruncate
zapped vmas in those ranges, but the faults must not succeed, and that
includes UFFDIO_COPY.
We could differentiate the retval to userland to represent a SIGBUS like
a page fault would do (vs SIGSEGV), but it doesn't seem very useful and
we'd need to pick a random retval as there's no meaningful syscall
retval that would differentiate from SIGSEGV and SIGBUS, there's just
-EFAULT.
Link: http://lkml.kernel.org/r/20171016223914.2421-2-aarcange@redhat.com
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Jiri Pirko says:
====================
net: core: introduce mini_Qdisc and eliminate usage of tp->q for clsact fastpath
This patchset's main patch is patch number 2. It carries the
description. Patch 1 is just a dependency.
---
v3->v4:
- rebased to be applicable on top of the current net-next
v2->v3:
- Using head change callback to replace miniq pointer every time tp head
changes. This eliminates one rcu dereference and makes the claim "without
added overhead" valid.
v1->v2:
- Use dev instead of skb->dev in sch_handle_egress as pointed out by Daniel
- Fixed synchronize_rcu_bh() in mini_qdisc_disable and commented
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In sch_handle_egress and sch_handle_ingress tp->q is used only in order
to update stats. So stats and filter list are the only things that are
needed in clsact qdisc fastpath processing. Introduce new mini_Qdisc
struct to hold those items. Also, introduce a helper to swap the
mini_Qdisc structures in case filter list head changes.
This removes need for tp->q usage without added overhead.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add a callback that is to be called whenever head of the chain changes.
Also provide a callback for the default case when the caller gets a
block using non-extended getter.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Lipeng says:
====================
net: hns3: support set_link_ksettings and for nway_reset ethtool command
This patch-set adds support for set_link_ksettings && for nway_resets
ethtool command and fixes some related ethtool bugs.
1, patch[4/6] adds support for ethtool_ops.set_link_ksettings.
2, patch[5/6] adds support ethtool_ops.for nway_reset.
3, patch[1/6,2/6,3/6,6/6] fix some bugs for getting port information by
ethtool command(ethtool ethx).
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch fixes a bug for phy supported feature initialization.
Currently, the value of phydev->supported is initialized by kernel.
So it includes many features that we do not support, such as
SUPPORTED_FIBRE and SUPPORTED_BNC. This patch fixes it.
Fixes: 256727d (net: hns3: Add MDIO support to HNS3 Ethernet driver for hip08 SoC)
Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Lipeng <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch adds nway_reset support for ethtool cmd.
Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Lipeng <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch adds set_link_ksettings support for ethtool cmd.
Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Lipeng <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The value of link_modes.advertising and the value of link_modes.supported
is initialized to zero every time in for loop in hns3_driv_to_eth_caps().
But we just want to set specified bit for them. Initialization is
unnecessary. This patch fixes it.
Fixes: 496d03e (net: hns3: Add Ethtool support to HNS3 driver)
Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Lipeng <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch fixes a bug for ethtool's get_link_ksettings().
The advertising for autoneg is always added to advertised_caps
whether autoneg is enable or disable. This patch fixes it.
Fixes: 496d03e (net: hns3: Add Ethtool support to HNS3 driver)
Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Lipeng <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch fixes a bug for ethtool's get_link_ksettings().
When phy exists, we should get autoneg from phy rather than from mac.
Because the value of mac.autoneg is invalid when phy exists.
Fixes: 496d03e (net: hns3: Add Ethtool support to HNS3 driver)
Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Lipeng <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Michael Chan says:
====================
bnxt_en: Fix IRQ coalescing regressions.
There was a typo and missing guard-rail against illegal values in the
recent code clean up. All reported by Andy Gospodarek.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Recent IRQ coalescing clean up has removed a guard-rail for the max DMA
buffer coalescing value. This is a 6-bit value and must not be 0. We
already have a check for 0 but 64 is equivalent to 0 and will cause
non-stop interrupts. Fix it by adding the proper check.
Fixes: f8503969d27b ("bnxt_en: Refactor and simplify coalescing code.")
Reported-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Recent refactoring of coalesce settings contained a typo that prevents
receive settings from being set properly.
Fixes: 18775aa8a91f ("bnxt_en: Reorganize the coalescing parameters.")
Signed-off-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|