Age | Commit message (Collapse) | Author | Files | Lines |
|
For ACLs implemented using either FIB rules or FIB entries, the BPF
program needs the FIB lookup status to be able to drop the packet.
Since the bpf_fib_lookup API has not reached a released kernel yet,
change the return code to contain an encoding of the FIB lookup
result and return the nexthop device index in the params struct.
In addition, inform the BPF program of any post FIB lookup reason as
to why the packet needs to go up the stack.
The fib result for unicast routes must have an egress device, so remove
the check that it is non-NULL.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Flag with FLAG_EXPECTED_FAIL the BPF_MAXINSNS tests that cannot be jited
on s390 because they exceed BPF_SIZE_MAX and fail when
CONFIG_BPF_JIT_ALWAYS_ON is set. Also set .expected_errcode to -ENOTSUPP
so the tests pass in that case.
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
test_lwt_seg6local.sh testing script
This test needs root privilege for it's successful execution.
This patch is atleast used to notify the user about the privilege
the script demands for the smooth execution of the test.
Signed-off-by: Jeffrin Jose T (Rajagiri SET) <ahiliation@gmail.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
test_lirc_mode2.sh testing script
The test_lirc_mode2.sh script require root privilege for the successful
execution of the test.
This patch is to notify the user about the privilege the script
demands for the successful execution of the test.
Signed-off-by: Jeffrin Jose T (Rajagiri SET) <ahiliation@gmail.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
CONFIG_NET_SCHED wasn't enabled in arm64's defconfig only for x86.
So bpf/test_tunnel.sh tests fails with:
RTNETLINK answers: Operation not supported
RTNETLINK answers: Operation not supported
We have an error talking to the kernel, -1
Enable NET_SCHED and more tests pass.
Fixes: 3bce593ac06b ("selftests: bpf: config: add config fragments")
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
If the kernel is compiled with CONFIG_CGROUP_BPF not enabled, it is not
possible to attach, detach or query IR BPF programs to /dev/lircN devices,
making them impossible to use. For embedded devices, it should be possible
to use IR decoding without cgroups or CONFIG_CGROUP_BPF enabled.
This change requires some refactoring, since bpf_prog_{attach,detach,query}
functions are now always compiled, but their code paths for cgroups need
moving out. Rather than a #ifdef CONFIG_CGROUP_BPF in kernel/bpf/syscall.c,
moving them to kernel/bpf/cgroup.c and kernel/bpf/sockmap.c does not
require #ifdefs since that is already conditionally compiled.
Fixes: f4364dcfc86d ("media: rc: introduce BPF_PROG_LIRC_MODE2")
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Stopping offload completely if replace of program failed dates
back to days of transparent offload. Back then we wanted to
silently fall back to the in-driver processing. Today we mark
programs for offload when they are loaded into the kernel, so
the transparent offload is no longer a reality.
Flags check in the driver will only allow replace of a driver
program with another driver program or an offload program with
another offload program.
When driver program is replaced stopping offload is a no-op,
because driver program isn't offloaded. When replacing
offloaded program if the offload fails the entire operation
will fail all the way back to user space and we should continue
using the old program. IOW when replacing a driver program
stopping offload is unnecessary and when replacing offloaded
program - it's a bug, old program should continue to run.
In practice this bug would mean that if offload operation was to
fail (either due to FW communication error, kernel OOM or new
program being offloaded but for a different netdev) driver
would continue reporting that previous XDP program is offloaded
but in fact no program will be loaded in hardware. The failure
is fairly unlikely (found by inspection, when working on the code)
but it's unpleasant.
Backport note: even though the bug was introduced in commit
cafa92ac2553 ("nfp: bpf: add support for XDP_FLAGS_HW_MODE"),
this fix depends on commit 441a33031fe5 ("net: xdp: don't allow
device-bound programs in driver mode"), so this fix is sufficient
only in v4.15 or newer. Kernels v4.13.x and v4.14.x do need to
stop offload if it was transparent/opportunistic, i.e. if
XDP_FLAGS_HW_MODE was not set on running program.
Fixes: cafa92ac2553 ("nfp: bpf: add support for XDP_FLAGS_HW_MODE")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
On one of our production test machine, when running
bpf selftest test_sockmap, I got the following error:
# sudo ./test_sockmap
libbpf: failed to create map (name: 'sock_map'): Operation not permitted
libbpf: failed to load object 'test_sockmap_kern.o'
libbpf: Can't get the 0th fd from program sk_skb1: only -1 instances
......
load_bpf_file: (-1) Operation not permitted
ERROR: (-1) load bpf failed
The error is due to not-big-enough rlimit
struct rlimit r = {10 * 1024 * 1024, RLIM_INFINITY};
The test already includes "bpf_rlimit.h", which sets current
and max rlimit to RLIM_INFINITY. Let us just use it.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
testing script
The test_kmod.sh script require root privilege for the successful
execution of the test.
This patch is to notify the user about the privilege the script
demands for the successful execution of the test.
Signed-off-by: Jeffrin Jose T (Rajagiri SET) <ahiliation@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Jakub Kicinski says:
====================
Two small fixes for error handling in bpftool prog load, first patch
removes a duplicated message, second one frees resources correctly.
Multiple error messages break JSON:
{
"error": "can't pin the object (/sys/fs/bpf/a): File exists"
},{
"error": "failed to pin program"
}
====================
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Remembering to close all descriptors and free memory may not seem
important in a user space tool like bpftool, but if we were to run
in batch mode the consumed resources start to add up quickly. Make
sure program load closes the libbpf object (which unloads and frees
it).
Fixes: 49a086c201a9 ("bpftool: implement prog load command")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
do_pin_fd() will already print out an error message if something
goes wrong. Printing another error is unnecessary and will break
JSON output, since error messages are full objects:
$ bpftool -jp prog load tracex1_kern.o /sys/fs/bpf/a
{
"error": "can't pin the object (/sys/fs/bpf/a): File exists"
},{
"error": "failed to pin program"
}
Fixes: 49a086c201a9 ("bpftool: implement prog load command")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
sha: 702353b538f5 ("selftest: add test for TCP_INQ") forgot to add
tcp_inq to .gitignore.
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When delta passed to gem_ptp_adjtime is negative, the sign is
maintained in the ns_to_timespec64 conversion. Hence timespec_add
should be used directly. timespec_sub will just subtract the negative
value thus increasing the time difference.
Signed-off-by: Harini Katakam <harini.katakam@xilinx.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit 296d48568042 ("ipvlan: inherit MTU from master device") adjusted
the mtu from the master device when creating a ipvlan device, but it
would also override the mtu value set in rtnl_create_link. It causes
IFLA_MTU param not to take effect.
So this patch is to not adjust the mtu if IFLA_MTU param is set when
creating a ipvlan device.
Fixes: 296d48568042 ("ipvlan: inherit MTU from master device")
Reported-by: Jianlin Shi <jishi@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently it is incrementing SctpFragUsrMsgs when the user message size
is of the exactly same size as the maximum fragment size, which is wrong.
The fix is to increment it only when user message is bigger than the
maximum fragment size.
Fixes: bfd2e4b8734d ("sctp: refactor sctp_datamsg_from_user")
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
After commit 9facc336876f ("bpf: reject any prog that failed read-only lock")
offsetof(struct bpf_binary_header, image) became 3 instead of 4,
breaking powerpc BPF badly, since instructions need to be word aligned.
Fixes: 9facc336876f ("bpf: reject any prog that failed read-only lock")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When injecting frames in the Ocelot switch driver an injection header
(IFH) should be used to configure various parameters related to a given
frame, such as the port onto which the frame should be departed or its
vlan id. Other parameters in the switch configuration can led to an
injected frame being sent without an IFH but this led to various issues
as the per-frame parameters are then not used. This is especially true
when using multiple ports for injection.
The IFH was injected with the wrong endianness which led to the switch
not taking it into account as the IFH_INJ_BYPASS bit was then unset.
(The bit tells the switch to use the IFH over its internal
configuration). This patch fixes it.
In addition to the endianness fix, the IFH is also fixed. As it was
(unwillingly) unused, some of its fields were not configured the right
way.
Fixes: a556c76adc05 ("net: mscc: Add initial Ocelot switch support")
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Reviewed-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Device tree based systems without of_dev_auxdata will have the mdio
device named differently than "davinci_mdio(.0)". In this case use the
device's parent's compatible string for matching
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pass the correct thing to rtl8169_interrupt() from netpoll.
Cc: Realtek linux nic maintainers <nic_swsd@realtek.com>
Cc: netdev@vger.kernel.org
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Fixes: ebcd5daa7ffd ("r8169: change interrupt handler argument type")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In function strp_data_ready(), it is useless to call queue_work if
the state of strparser is already paused. The state checking should
be done before calling queue_work. The change reduces the context
switches and improves the ktls-rx throughput by approx 20% (measured
on cortex-a53 based platform).
Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
Acked-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add fragments to pass bridge and vlan tests.
Fixes: 33b01b7b4f19 ("selftests: add rtnetlink test script")
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use $(OBJDUMP) instead of literal 'objdump' to avoid
using host toolchain when cross compiling.
Fixes: 421780fd4983 ("bpfilter: fix build error")
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Reported-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pull rdma fixes from Jason Gunthorpe:
"Here are eight fairly small fixes collected over the last two weeks.
Regression and crashing bug fixes:
- mlx4/5: Fixes for issues found from various checkers
- A resource tracking and uverbs regression in the core code
- qedr: NULL pointer regression found during testing
- rxe: Various small bugs"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
IB/rxe: Fix missing completion for mem_reg work requests
RDMA/core: Save kernel caller name when creating CQ using ib_create_cq()
IB/uverbs: Fix ordering of ucontext check in ib_uverbs_write
IB/mlx4: Fix an error handling path in 'mlx4_ib_rereg_user_mr()'
RDMA/qedr: Fix NULL pointer dereference when running over iWARP without RDMA-CM
IB/mlx5: Fix return value check in flow_counters_set_data()
IB/mlx5: Fix memory leak in mlx5_ib_create_flow
IB/rxe: avoid double kfree skb
|
|
Pull networking fixes from David Miller:
1) Fix crash on bpf_prog_load() errors, from Daniel Borkmann.
2) Fix ATM VCC memory accounting, from David Woodhouse.
3) fib6_info objects need RCU freeing, from Eric Dumazet.
4) Fix SO_BINDTODEVICE handling for TCP sockets, from David Ahern.
5) Fix clobbered error code in enic_open() failure path, from
Govindarajulu Varadarajan.
6) Propagate dev_get_valid_name() error returns properly, from Li
RongQing.
7) Fix suspend/resume in davinci_emac driver, from Bartosz Golaszewski.
8) Various act_ife fixes (recursive locking, IDR leaks, etc.) from
Davide Caratti.
9) Fix buggy checksum handling in sungem driver, from Eric Dumazet.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (40 commits)
ip: limit use of gso_size to udp
stmmac: fix DMA channel hang in half-duplex mode
net: stmmac: socfpga: add additional ocp reset line for Stratix10
net: sungem: fix rx checksum support
bpfilter: ignore binary files
bpfilter: fix build error
net/usb/drivers: Remove useless hrtimer_active check
net/sched: act_ife: preserve the action control in case of error
net/sched: act_ife: fix recursive lock and idr leak
net: ethernet: fix suspend/resume in davinci_emac
net: propagate dev_get_valid_name return code
enic: do not overwrite error code
net/tcp: Fix socket lookups with SO_BINDTODEVICE
ptp: replace getnstimeofday64() with ktime_get_real_ts64()
net/ipv6: respect rcu grace period before freeing fib6_info
net: net_failover: fix typo in net_failover_slave_register()
ipvlan: use ETH_MAX_MTU as max mtu
net: hamradio: use eth_broadcast_addr
enic: initialize enic->rfs_h.lock in enic_probe
MAINTAINERS: Add Sam as the maintainer for NCSI
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
Pull HID fixes from Jiri Kosina:
- Wacom 2nd-gen Intuos Pro large Y axis handling fix from Jason Gerecke
- fix for hibernation in Intel ISH driver, from Even Xu
- crash fix for hid-steam driver, from Rodrigo Rivas Costa
- new device ID addition to google-hammer driver
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: wacom: Correct logical maximum Y for 2nd-gen Intuos Pro large
HID: intel_ish-hid: ipc: register more pm callbacks to support hibernation
HID: steam: use hid_device.driver_data instead of hid_set_drvdata()
HID: google: Add support for whiskers
|
|
Pull dma-mapping rename from Christoph Hellwig:
"Move all the dma-mapping code to kernel/dma and lose their dma-*
prefixes"
* tag 'dma-rename-4.18' of git://git.infradead.org/users/hch/dma-mapping:
dma-mapping: move all DMA mapping code to kernel/dma
dma-mapping: use obj-y instead of lib-y for generic dma ops
|
|
The HID descriptor for the 2nd-gen Intuos Pro large (PTH-860) contains
a typo which defines an incorrect logical maximum Y value. This causes
a small portion of the bottom of the tablet to become unusable (both
because the area is below the "bottom" of the tablet and because
'wacom_wac_event' ignores out-of-range values). It also results in a
skewed aspect ratio.
To fix this, we add a quirk to 'wacom_usage_mapping' which overwrites
the data with the correct value.
Signed-off-by: Jason Gerecke <jason.gerecke@wacom.com>
CC: stable@vger.kernel.org # v4.10+
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
|
Current ISH driver only registers suspend/resume PM callbacks which don't
support hibernation (suspend to disk). Basically after hiberation, the ISH
can't resume properly and user may not see sensor events (for example: screen
rotation may not work).
User will not see a crash or panic or anything except the following message
in log:
hid-sensor-hub 001F:8086:22D8.0001: timeout waiting for response from ISHTP device
So this patch adds support for S4/hiberbation to ISH by using the
SIMPLE_DEV_PM_OPS() MACRO instead of struct dev_pm_ops directly. The suspend
and resume functions will now be used for both suspend to RAM and hibernation.
If power management is disabled, SIMPLE_DEV_PM_OPS will do nothing, the suspend
and resume related functions won't be used, so mark them as __maybe_unused to
clarify that this is the intended behavior, and remove #ifdefs for power
management.
Cc: stable@vger.kernel.org
Signed-off-by: Even Xu <even.xu@intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
|
When creating the low-level hidraw device, the reference to steam_device
was stored using hid_set_drvdata(). But this value is not guaranteed to
be kept when set before calling probe. If this pointer is reset, it
crashes when opening the emulated hidraw device.
It looks like hid_set_drvdata() is for users "avobe" this hid_device,
while hid_device.driver_data it for users "below" this one.
In this case, we are creating a virtual hidraw device, so we must use
hid_device.driver_data.
Signed-off-by: Rodrigo Rivas Costa <rodrigorivascosta@gmail.com>
Tested-by: Mariusz Ceier <mceier+kernel@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
|
The rewrite of the cmdline fetching missed the fact that we used to also
return the final terminating NUL character of the last argument. I
hadn't noticed, and none of the tools I tested cared, but something
obviously must care, because Michal Kubecek noticed the change in
behavior.
Tweak the "find the end" logic to actually include the NUL character,
and once past the eend of argv, always start the strnlen() at the
expected (original) argument end.
This whole "allow people to rewrite their arguments in place" is a nasty
hack and requires that odd slop handling at the end of the argv array,
but it's our traditional model, so we continue to support it.
Repored-and-bisected-by: Michal Kubecek <mkubecek@suse.cz>
Reviewed-and-tested-by: Michal Kubecek <mkubecek@suse.cz>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The ipcm(6)_cookie field gso_size is set only in the udp path. The ip
layer copies this to cork only if sk_type is SOCK_DGRAM. This check
proved too permissive. Ping and l2tp sockets have the same type.
Limit to sockets of type SOCK_DGRAM and protocol IPPROTO_UDP to
exclude ping sockets.
v1 -> v2
- remove irrelevant whitespace changes
Fixes: bec1f6f69736 ("udp: generate gso with UDP_SEGMENT")
Reported-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
HW does not support Half-duplex mode in multi-queue
scenario. Fix it by not advertising the Half-Duplex
mode if multi-queue enabled.
Signed-off-by: Bhadram Varka <vbhadram@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The Stratix10 platform has an additional reset line, OCP(Open Core Protocol),
that also needs to get deasserted for the stmmac ethernet controller to work.
Thus we need to update the Kconfig to include ARCH_STRATIX10 in order to build
dwmac-socfpga.
Also, remove the redundant check for the reset controller pointer. The
reset driver already checks for the pointer and returns 0 if the pointer
is NULL.
Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
After commit 88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE
are friends"), sungem owners reported the infamous "eth0: hw csum failure"
message.
CHECKSUM_COMPLETE has in fact never worked for this driver, but this
was masked by the fact that upper stacks had to strip the FCS, and
therefore skb->ip_summed was set back to CHECKSUM_NONE before
my recent change.
Driver configures a number of bytes to skip when the chip computes
the checksum, and for some reason only half of the Ethernet header
was skipped.
Then a second problem is that we should strip the FCS by default,
unless the driver is updated to eventually support NETIF_F_RXFCS in
the future.
Finally, a driver should check if NETIF_F_RXCSUM feature is enabled
or not, so that the admin can turn off rx checksum if wanted.
Many thanks to Andreas Schwab and Mathieu Malaterre for their
help in debugging this issue.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Meelis Roos <mroos@linux.ee>
Reported-by: Mathieu Malaterre <malat@debian.org>
Reported-by: Andreas Schwab <schwab@linux-m68k.org>
Tested-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
net/bpfilter/bpfilter_umh is a binary file generated when bpfilter is
enabled, add it to .gitignore to avoid committing it.
Fixes: d2ba09c17a064 ("net: add skeleton of bpfilter kernel module")
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
bpfilter Makefile assumes that the system locale is en_US, and the
parsing of objdump output fails.
Set LC_ALL=C and, while at it, rewrite the objdump parsing so it spawns
only 2 processes instead of 7.
Fixes: d2ba09c17a064 ("net: add skeleton of bpfilter kernel module")
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The code does:
if (hrtimer_active(&t))
hrtimer_cancel(&t);
However, hrtimer_cancel() checks if the timer is active, so the
test above is pointless.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
in the following script
# tc actions add action ife encode allow prio pass index 42
# tc actions replace action ife encode allow tcindex drop index 42
the action control should remain equal to 'pass', if the kernel failed
to replace the TC action. Pospone the assignment of the action control,
to ensure it is not overwritten in the error path of tcf_ife_init().
Fixes: ef6980b6becb ("introduce IFE action")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
a recursive lock warning [1] can be observed with the following script,
# $TC actions add action ife encode allow prio pass index 42
IFE type 0xED3E
# $TC actions replace action ife encode allow tcindex pass index 42
in case the kernel was unable to run the last command (e.g. because of
the impossibility to load 'act_meta_skbtcindex'). For a similar reason,
the kernel can leak idr in the error path of tcf_ife_init(), because
tcf_idr_release() is not called after successful idr reservation:
# $TC actions add action ife encode allow tcindex index 47
IFE type 0xED3E
RTNETLINK answers: No such file or directory
We have an error talking to the kernel
# $TC actions add action ife encode allow tcindex index 47
IFE type 0xED3E
RTNETLINK answers: No space left on device
We have an error talking to the kernel
# $TC actions add action ife encode use mark 7 type 0xfefe pass index 47
IFE type 0xFEFE
RTNETLINK answers: No space left on device
We have an error talking to the kernel
Since tcfa_lock is already taken when the action is being edited, a call
to tcf_idr_release() wrongly makes tcf_idr_cleanup() take the same lock
again. On the other hand, tcf_idr_release() needs to be called in the
error path of tcf_ife_init(), to undo the last tcf_idr_create() invocation.
Fix both problems in tcf_ife_init().
Since the cleanup() routine can now be called when ife->params is NULL,
also add a NULL pointer check to avoid calling kfree_rcu(NULL, rcu).
[1]
============================================
WARNING: possible recursive locking detected
4.17.0-rc4.kasan+ #417 Tainted: G E
--------------------------------------------
tc/3932 is trying to acquire lock:
000000005097c9a6 (&(&p->tcfa_lock)->rlock){+...}, at: tcf_ife_cleanup+0x19/0x80 [act_ife]
but task is already holding lock:
000000005097c9a6 (&(&p->tcfa_lock)->rlock){+...}, at: tcf_ife_init+0xf6d/0x13c0 [act_ife]
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&(&p->tcfa_lock)->rlock);
lock(&(&p->tcfa_lock)->rlock);
*** DEADLOCK ***
May be due to missing lock nesting notation
2 locks held by tc/3932:
#0: 000000007ca8e990 (rtnl_mutex){+.+.}, at: tcf_ife_init+0xf61/0x13c0 [act_ife]
#1: 000000005097c9a6 (&(&p->tcfa_lock)->rlock){+...}, at: tcf_ife_init+0xf6d/0x13c0 [act_ife]
stack backtrace:
CPU: 3 PID: 3932 Comm: tc Tainted: G E 4.17.0-rc4.kasan+ #417
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
Call Trace:
dump_stack+0x9a/0xeb
__lock_acquire+0xf43/0x34a0
? debug_check_no_locks_freed+0x2b0/0x2b0
? debug_check_no_locks_freed+0x2b0/0x2b0
? debug_check_no_locks_freed+0x2b0/0x2b0
? __mutex_lock+0x62f/0x1240
? kvm_sched_clock_read+0x1a/0x30
? sched_clock+0x5/0x10
? sched_clock_cpu+0x18/0x170
? find_held_lock+0x39/0x1d0
? lock_acquire+0x10b/0x330
lock_acquire+0x10b/0x330
? tcf_ife_cleanup+0x19/0x80 [act_ife]
_raw_spin_lock_bh+0x38/0x70
? tcf_ife_cleanup+0x19/0x80 [act_ife]
tcf_ife_cleanup+0x19/0x80 [act_ife]
__tcf_idr_release+0xff/0x350
tcf_ife_init+0xdde/0x13c0 [act_ife]
? ife_exit_net+0x290/0x290 [act_ife]
? __lock_is_held+0xb4/0x140
tcf_action_init_1+0x67b/0xad0
? tcf_action_dump_old+0xa0/0xa0
? sched_clock+0x5/0x10
? sched_clock_cpu+0x18/0x170
? kvm_sched_clock_read+0x1a/0x30
? sched_clock+0x5/0x10
? sched_clock_cpu+0x18/0x170
? memset+0x1f/0x40
tcf_action_init+0x30f/0x590
? tcf_action_init_1+0xad0/0xad0
? memset+0x1f/0x40
tc_ctl_action+0x48e/0x5e0
? mutex_lock_io_nested+0x1160/0x1160
? tca_action_gd+0x990/0x990
? sched_clock+0x5/0x10
? find_held_lock+0x39/0x1d0
rtnetlink_rcv_msg+0x4da/0x990
? validate_linkmsg+0x680/0x680
? sched_clock_cpu+0x18/0x170
? find_held_lock+0x39/0x1d0
netlink_rcv_skb+0x127/0x350
? validate_linkmsg+0x680/0x680
? netlink_ack+0x970/0x970
? __kmalloc_node_track_caller+0x304/0x3a0
netlink_unicast+0x40f/0x5d0
? netlink_attachskb+0x580/0x580
? _copy_from_iter_full+0x187/0x760
? import_iovec+0x90/0x390
netlink_sendmsg+0x67f/0xb50
? netlink_unicast+0x5d0/0x5d0
? copy_msghdr_from_user+0x206/0x340
? netlink_unicast+0x5d0/0x5d0
sock_sendmsg+0xb3/0xf0
___sys_sendmsg+0x60a/0x8b0
? copy_msghdr_from_user+0x340/0x340
? lock_downgrade+0x5e0/0x5e0
? tty_write_lock+0x18/0x50
? kvm_sched_clock_read+0x1a/0x30
? sched_clock+0x5/0x10
? sched_clock_cpu+0x18/0x170
? find_held_lock+0x39/0x1d0
? lock_downgrade+0x5e0/0x5e0
? lock_acquire+0x10b/0x330
? __audit_syscall_entry+0x316/0x690
? current_kernel_time64+0x6b/0xd0
? __fget_light+0x55/0x1f0
? __sys_sendmsg+0xd2/0x170
__sys_sendmsg+0xd2/0x170
? __ia32_sys_shutdown+0x70/0x70
? syscall_trace_enter+0x57a/0xd60
? rcu_read_lock_sched_held+0xdc/0x110
? __bpf_trace_sys_enter+0x10/0x10
? do_syscall_64+0x22/0x480
do_syscall_64+0xa5/0x480
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x7fd646988ba0
RSP: 002b:00007fffc9fab3c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007fffc9fab4f0 RCX: 00007fd646988ba0
RDX: 0000000000000000 RSI: 00007fffc9fab440 RDI: 0000000000000003
RBP: 000000005b28c8b3 R08: 0000000000000002 R09: 0000000000000000
R10: 00007fffc9faae20 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fffc9fab504 R14: 0000000000000001 R15: 000000000066c100
Fixes: 4e8c86155010 ("net sched: net sched: ife action fix late binding")
Fixes: ef6980b6becb ("introduce IFE action")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch reverts commit 3243ff2a05ec ("net: ethernet: davinci_emac:
Deduplicate bus_find_device() by name matching") and adds a comment
which should stop anyone from reintroducing the same "fix" in the future.
We can't use bus_find_device_by_name() here because the device name is
not guaranteed to be 'davinci_mdio'. On some systems it can be
'davinci_mdio.0' so we need to use strncmp() against the first part of
the string to correctly match it.
Fixes: 3243ff2a05ec ("net: ethernet: davinci_emac: Deduplicate bus_find_device() by name matching")
Cc: stable@vger.kernel.org
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Acked-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
if dev_get_valid_name failed, propagate its return code
and remove the setting err to ENODEV, it will be set to
0 again before dev_change_net_namespace exits.
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In failure path, we overwrite err to what vnic_rq_disable() returns. In
case it returns 0, enic_open() returns success in case of error.
Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Fixes: e8588e268509 ("enic: enable rq before updating rq descriptors")
Signed-off-by: Govindarajulu Varadarajan <gvaradar@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Similar to 69678bcd4d2d ("udp: fix SO_BINDTODEVICE"), TCP socket lookups
need to fail if dev_match is not true. Currently, a packet to a given port
can match a socket bound to device when it should not. In the VRF case,
this causes the lookup to hit a VRF socket and not a global socket
resulting in a response trying to go through the VRF when it should not.
Fixes: 3fa6f616a7a4d ("net: ipv4: add second dif to inet socket lookups")
Fixes: 4297a0ef08572 ("net: ipv6: add second dif to inet6 socket lookups")
Reported-by: Lou Berger <lberger@labn.net>
Diagnosed-by: Renato Westphal <renato@opensourcerouting.org>
Tested-by: Renato Westphal <renato@opensourcerouting.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
getnstimeofday64() is deprecated and getting replaced throughout
the kernel with ktime_get_*() based helpers for a more consistent
interface.
The two functions do the exact same thing, so this is just
a cosmetic change.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
syzbot reported use after free that is caused by fib6_info being
freed without a proper RCU grace period.
CPU: 0 PID: 1407 Comm: udevd Not tainted 4.17.0+ #39
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x1b9/0x294 lib/dump_stack.c:113
print_address_description+0x6c/0x20b mm/kasan/report.c:256
kasan_report_error mm/kasan/report.c:354 [inline]
kasan_report.cold.7+0x242/0x2fe mm/kasan/report.c:412
__asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
__read_once_size include/linux/compiler.h:188 [inline]
find_rr_leaf net/ipv6/route.c:705 [inline]
rt6_select net/ipv6/route.c:761 [inline]
fib6_table_lookup+0x12b7/0x14d0 net/ipv6/route.c:1823
ip6_pol_route+0x1c2/0x1020 net/ipv6/route.c:1856
ip6_pol_route_output+0x54/0x70 net/ipv6/route.c:2082
fib6_rule_lookup+0x211/0x6d0 net/ipv6/fib6_rules.c:122
ip6_route_output_flags+0x2c5/0x350 net/ipv6/route.c:2110
ip6_route_output include/net/ip6_route.h:82 [inline]
icmpv6_xrlim_allow net/ipv6/icmp.c:211 [inline]
icmp6_send+0x147c/0x2da0 net/ipv6/icmp.c:535
icmpv6_send+0x17a/0x300 net/ipv6/ip6_icmp.c:43
ip6_link_failure+0xa5/0x790 net/ipv6/route.c:2244
dst_link_failure include/net/dst.h:427 [inline]
ndisc_error_report+0xd1/0x1c0 net/ipv6/ndisc.c:695
neigh_invalidate+0x246/0x550 net/core/neighbour.c:892
neigh_timer_handler+0xaf9/0xde0 net/core/neighbour.c:978
call_timer_fn+0x230/0x940 kernel/time/timer.c:1326
expire_timers kernel/time/timer.c:1363 [inline]
__run_timers+0x79e/0xc50 kernel/time/timer.c:1666
run_timer_softirq+0x4c/0x70 kernel/time/timer.c:1692
__do_softirq+0x2e0/0xaf5 kernel/softirq.c:284
invoke_softirq kernel/softirq.c:364 [inline]
irq_exit+0x1d1/0x200 kernel/softirq.c:404
exiting_irq arch/x86/include/asm/apic.h:527 [inline]
smp_apic_timer_interrupt+0x17e/0x710 arch/x86/kernel/apic/apic.c:1052
apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:863
</IRQ>
RIP: 0010:strlen+0x5e/0xa0 lib/string.c:482
Code: 24 00 74 3b 48 bb 00 00 00 00 00 fc ff df 4c 89 e0 48 83 c0 01 48 89 c2 48 89 c1 48 c1 ea 03 83 e1 07 0f b6 14 1a 38 ca 7f 04 <84> d2 75 23 80 38 00 75 de 48 83 c4 08 4c 29 e0 5b 41 5c 5d c3 48
RSP: 0018:ffff8801af117850 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
RAX: ffff880197f53bd0 RBX: dffffc0000000000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff81c5b06c RDI: ffff880197f53bc0
RBP: ffff8801af117868 R08: ffff88019a976540 R09: 0000000000000000
R10: ffff88019a976540 R11: 0000000000000000 R12: ffff880197f53bc0
R13: ffff880197f53bc0 R14: ffffffff899e4e90 R15: ffff8801d91c6a00
strlen include/linux/string.h:267 [inline]
getname_kernel+0x24/0x370 fs/namei.c:218
open_exec+0x17/0x70 fs/exec.c:882
load_elf_binary+0x968/0x5610 fs/binfmt_elf.c:780
search_binary_handler+0x17d/0x570 fs/exec.c:1653
exec_binprm fs/exec.c:1695 [inline]
__do_execve_file.isra.35+0x16fe/0x2710 fs/exec.c:1819
do_execveat_common fs/exec.c:1866 [inline]
do_execve fs/exec.c:1883 [inline]
__do_sys_execve fs/exec.c:1964 [inline]
__se_sys_execve fs/exec.c:1959 [inline]
__x64_sys_execve+0x8f/0xc0 fs/exec.c:1959
do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x7f1576a46207
Code: 77 19 f4 48 89 d7 44 89 c0 0f 05 48 3d 00 f0 ff ff 76 e0 f7 d8 64 41 89 01 eb d8 f7 d8 64 41 89 01 eb df b8 3b 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 02 f3 c3 48 8b 15 00 8c 2d 00 f7 d8 64 89 02
RSP: 002b:00007ffff2784568 EFLAGS: 00000202 ORIG_RAX: 000000000000003b
RAX: ffffffffffffffda RBX: 00000000ffffffff RCX: 00007f1576a46207
RDX: 0000000001215b10 RSI: 00007ffff2784660 RDI: 00007ffff2785670
RBP: 0000000000625500 R08: 000000000000589c R09: 000000000000589c
R10: 0000000000000000 R11: 0000000000000202 R12: 0000000001215b10
R13: 0000000000000007 R14: 0000000001204250 R15: 0000000000000005
Allocated by task 12188:
save_stack+0x43/0xd0 mm/kasan/kasan.c:448
set_track mm/kasan/kasan.c:460 [inline]
kasan_kmalloc+0xc4/0xe0 mm/kasan/kasan.c:553
kmem_cache_alloc_trace+0x152/0x780 mm/slab.c:3620
kmalloc include/linux/slab.h:513 [inline]
kzalloc include/linux/slab.h:706 [inline]
fib6_info_alloc+0xbb/0x280 net/ipv6/ip6_fib.c:152
ip6_route_info_create+0x782/0x2b50 net/ipv6/route.c:3013
ip6_route_add+0x23/0xb0 net/ipv6/route.c:3154
ipv6_route_ioctl+0x5a5/0x760 net/ipv6/route.c:3660
inet6_ioctl+0x100/0x1f0 net/ipv6/af_inet6.c:546
sock_do_ioctl+0xe4/0x3e0 net/socket.c:973
sock_ioctl+0x30d/0x680 net/socket.c:1097
vfs_ioctl fs/ioctl.c:46 [inline]
file_ioctl fs/ioctl.c:500 [inline]
do_vfs_ioctl+0x1cf/0x16f0 fs/ioctl.c:684
ksys_ioctl+0xa9/0xd0 fs/ioctl.c:701
__do_sys_ioctl fs/ioctl.c:708 [inline]
__se_sys_ioctl fs/ioctl.c:706 [inline]
__x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:706
do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Freed by task 1402:
save_stack+0x43/0xd0 mm/kasan/kasan.c:448
set_track mm/kasan/kasan.c:460 [inline]
__kasan_slab_free+0x11a/0x170 mm/kasan/kasan.c:521
kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528
__cache_free mm/slab.c:3498 [inline]
kfree+0xd9/0x260 mm/slab.c:3813
fib6_info_destroy+0x29b/0x350 net/ipv6/ip6_fib.c:207
fib6_info_release include/net/ip6_fib.h:286 [inline]
__ip6_del_rt_siblings net/ipv6/route.c:3235 [inline]
ip6_route_del+0x11c4/0x13b0 net/ipv6/route.c:3316
ipv6_route_ioctl+0x616/0x760 net/ipv6/route.c:3663
inet6_ioctl+0x100/0x1f0 net/ipv6/af_inet6.c:546
sock_do_ioctl+0xe4/0x3e0 net/socket.c:973
sock_ioctl+0x30d/0x680 net/socket.c:1097
vfs_ioctl fs/ioctl.c:46 [inline]
file_ioctl fs/ioctl.c:500 [inline]
do_vfs_ioctl+0x1cf/0x16f0 fs/ioctl.c:684
ksys_ioctl+0xa9/0xd0 fs/ioctl.c:701
__do_sys_ioctl fs/ioctl.c:708 [inline]
__se_sys_ioctl fs/ioctl.c:706 [inline]
__x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:706
do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
The buggy address belongs to the object at ffff8801b5df2580
which belongs to the cache kmalloc-256 of size 256
The buggy address is located 8 bytes inside of
256-byte region [ffff8801b5df2580, ffff8801b5df2680)
The buggy address belongs to the page:
page:ffffea0006d77c80 count:1 mapcount:0 mapping:ffff8801da8007c0 index:0xffff8801b5df2e40
flags: 0x2fffc0000000100(slab)
raw: 02fffc0000000100 ffffea0006c5cc48 ffffea0007363308 ffff8801da8007c0
raw: ffff8801b5df2e40 ffff8801b5df2080 0000000100000006 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff8801b5df2480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8801b5df2500: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
> ffff8801b5df2580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff8801b5df2600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8801b5df2680: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
Fixes: a64efe142f5e ("net/ipv6: introduce fib6_info struct and helpers")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: David Ahern <dsahern@gmail.com>
Reported-by: syzbot+9e6d75e3edef427ee888@syzkaller.appspotmail.com
Acked-by: David Ahern <dsahern@gmail.com>
Tested-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Sync both unicast and multicast lists instead of unicast twice.
Fixes: cfc80d9a116 ("net: Introduce net_failover driver")
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Similar to the fixes on team and bonding, this restores the ability
to set an ipvlan device's mtu to anything higher than 1500.
Fixes: 91572088e3fd ("net: use core MTU range checking in core net infra")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The array bpq_eth_addr is only used to get the size of an
address, whereas the bcast_addr is used to set the broadcast
address. This leads to a warning when using clang:
drivers/net/hamradio/bpqether.c:94:13: warning: variable 'bpq_eth_addr' is not
needed and will not be emitted [-Wunneeded-internal-declaration]
static char bpq_eth_addr[6];
^
Remove both variables and use the common eth_broadcast_addr
to set the broadcast address.
Signed-off-by: Stefan Agner <stefan@agner.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
lockdep spotted that we are using rfs_h.lock in enic_get_rxnfc() without
initializing. rfs_h.lock is initialized in enic_open(). But ethtool_ops
can be called when interface is down.
Move enic_rfs_flw_tbl_init to enic_probe.
INFO: trying to register non-static key.
the code is fine but needs lockdep annotation.
turning off the locking correctness validator.
CPU: 18 PID: 1189 Comm: ethtool Not tainted 4.17.0-rc7-devel+ #27
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20171110_100015-anatol 04/01/2014
Call Trace:
dump_stack+0x85/0xc0
register_lock_class+0x550/0x560
? __handle_mm_fault+0xa8b/0x1100
__lock_acquire+0x81/0x670
lock_acquire+0xb9/0x1e0
? enic_get_rxnfc+0x139/0x2b0 [enic]
_raw_spin_lock_bh+0x38/0x80
? enic_get_rxnfc+0x139/0x2b0 [enic]
enic_get_rxnfc+0x139/0x2b0 [enic]
ethtool_get_rxnfc+0x8d/0x1c0
dev_ethtool+0x16c8/0x2400
? __mutex_lock+0x64d/0xa00
? dev_load+0x6a/0x150
dev_ioctl+0x253/0x4b0
sock_do_ioctl+0x9a/0x130
sock_ioctl+0x1af/0x350
do_vfs_ioctl+0x8e/0x670
? syscall_trace_enter+0x1e2/0x380
ksys_ioctl+0x60/0x90
__x64_sys_ioctl+0x16/0x20
do_syscall_64+0x5a/0x170
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Signed-off-by: Govindarajulu Varadarajan <gvaradar@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|