Age | Commit message (Collapse) | Author | Files | Lines |
|
xen_maybe_preempt_hcall() is called from the exception entry point
xen_do_hypervisor_callback with interrupts disabled.
_cond_resched() evades the might_sleep() check in cond_resched() which
would have caught that and schedule_debug() unfortunately lacks a check
for irqs_disabled().
Enable interrupts around the call and use cond_resched() to catch future
issues.
Fixes: fdfd811ddde3 ("x86/xen: allow privcmd hypercalls to be preempted")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/878skypjrh.fsf@nanos.tec.linutronix.de
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
|
|
Fix a varargs-related bug in print_synth_event() which resulted in
strange output and oopses on 32-bit x86 systems. The problem is that
trace_seq_printf() expects the varargs to match the format string, but
print_synth_event() was always passing u64 values regardless. This
results in unspecified behavior when unpacking with va_arg() in
trace_seq_printf().
Add a function that takes the size into account when calling
trace_seq_printf().
Before:
modprobe-1731 [003] .... 919.039758: gen_synth_test: next_pid_field=777(null)next_comm_field=hula hoops ts_ns=1000000 ts_ms=1000 cpu=3(null)my_string_field=thneed my_int_field=598(null)
After:
insmod-1136 [001] .... 36.634590: gen_synth_test: next_pid_field=777 next_comm_field=hula hoops ts_ns=1000000 ts_ms=1000 cpu=1 my_string_field=thneed my_int_field=598
Link: http://lkml.kernel.org/r/a9b59eb515dbbd7d4abe53b347dccf7a8e285657.1581720155.git.zanussi@kernel.org
Reported-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Commit 7276531d4036('tracing: Consolidate trace() functions')
inadvertently dropped the synth_event_trace() and
synth_event_trace_array() checks that verify the number of values
passed in matches the number of fields in the synthetic event being
traced, so add them back.
Link: http://lkml.kernel.org/r/32819cac708714693669e0dfe10fe9d935e94a16.1581720155.git.zanussi@kernel.org
Signed-off-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
synth_event_trace(), synth_event_trace_array() and
__synth_event_add_val() write directly into the trace buffer and need
to take endianness into account, like trace_event_raw_event_synth()
does.
Link: http://lkml.kernel.org/r/2011354355e405af9c9d28abba430d1f5ff7771a.1581720155.git.zanussi@kernel.org
Signed-off-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
synth_event_trace() is the varargs version of synth_event_trace_array(),
which takes an array of u64, as do synth_event_add_val() et al.
To not only be consistent with those, but also to address the fact
that synth_event_trace() expects every arg to be of the same type
since it doesn't also pass in e.g. a format string, the caller needs
to make sure all args are of the same type, u64. u64 is used because
it needs to accomodate the largest type available in synthetic events,
which is u64.
This fixes the bug reported by the kernel test robot/Rong Chen.
Link: https://lore.kernel.org/lkml/20200212113444.GS12867@shao2-debian/
Link: http://lkml.kernel.org/r/894c4e955558b521210ee0642ba194a9e603354c.1581720155.git.zanussi@kernel.org
Fixes: 9fe41efaca084 ("tracing: Add synth event generation test module")
Reported-by: kernel test robot <rong.a.chen@intel.com>
Signed-off-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Some code pathes rely on preempt_disable() to prevent migration on a non RT
enabled kernel. These preempt_disable/enable() pairs are substituted by
migrate_disable/enable() pairs or other forms of RT specific protection. On
RT these protections prevent migration but not preemption. Obviously a
cant_sleep() check in such a section will trigger on RT because preemption
is not disabled.
Provide a cant_migrate() macro which maps to cant_sleep() on a non RT
kernel and an empty placeholder for RT for now. The placeholder will be
changed to a proper debug check along with the RT specific migration
protection mechanism.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200214161503.070487511@linutronix.de
|
|
Code which solely needs to prevent migration of a task uses
preempt_disable()/enable() pairs. This is the only reliable way to do so
as setting the task affinity to a single CPU can be undone by a
setaffinity operation from a different task/process.
RT provides a seperate migrate_disable/enable() mechanism which does not
disable preemption to achieve the semantic requirements of a (almost) fully
preemptible kernel.
As it is unclear from looking at a given code path whether the intention is
to disable preemption or migration, introduce migrate_disable/enable()
inline functions which can be used to annotate code which merely needs to
disable migration. Map them to preempt_disable/enable() for now. The RT
substitution will be provided later.
Code which is annotated that way documents that it has no requirement to
protect against reentrancy of a preempting task. Either this is not
required at all or the call sites are already serialized by other means.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ben Segall <bsegall@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/878slclv1u.fsf@nanos.tec.linutronix.de
|
|
If BPF program is using BTF-defined maps, BTF is required only for
libbpf itself to process map definitions. If after that BTF fails to
be loaded into kernel (e.g., if it doesn't support BTF at all), this
shouldn't prevent valid BPF program from loading. Existing
retry-without-BTF logic for creating maps will succeed to create such
maps without any problems. So, presence of .maps section shouldn't make
BTF required for kernel. Update the check accordingly.
Validated by ensuring simple BPF program with BTF-defined maps is still
loaded on old kernel without BTF support and map is correctly parsed and
created.
Fixes: abd29c931459 ("libbpf: allow specifying map definitions using BTF")
Reported-by: Julia Kartseva <hex@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200220062635.1497872-1-andriin@fb.com
|
|
Julian Wiedmann says:
====================
s390/qeth: fixes 2020-02-20
please apply the following patch series for qeth to netdev's net tree.
This corrects three minor issues:
1) return a more fitting errno when VNICC cmds are not supported,
2) remove a bogus WARN in the NAPI code, and
3) be _very_ pedantic about the RX copybreak.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The RX copybreak is intended as the _max_ value where the frame's data
should be copied. So for frame_len == copybreak, don't build an SG skb.
Fixes: 4a71df50047f ("qeth: new qeth device driver")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Calling napi->poll() with 0 budget is a legitimate use by netpoll.
Fixes: a1c3ed4c9ca0 ("qeth: NAPI support for l2 and l3 discipline")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When getting or setting VNICC parameters, the error code EOPNOTSUPP
should have precedence over EBUSY.
EBUSY is used because vnicc feature and bridgeport feature are mutually
exclusive, which is a temporary condition.
Whereas EOPNOTSUPP indicates that the HW does not support all or parts of
the vnicc feature.
This issue causes the vnicc sysfs params to show 'blocked by bridgeport'
for HW that does not support VNICC at all.
Fixes: caa1f0b10d18 ("s390/qeth: add VNICC enable/disable support")
Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Trivial cleanup, so that all bridge port-specific code can be found in
one go.
CC: Johannes Berg <johannes@sipsolutions.net>
CC: Roopa Prabhu <roopa@cumulusnetworks.com>
CC: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Functions starting with __ usually indicate those which are exported,
but should not be called directly. Update some of those declared in the
API and make it more readable.
page_pool_unmap_page() and page_pool_release_page() were doing
exactly the same thing calling __page_pool_clean_page(). Let's
rename __page_pool_clean_page() to page_pool_release_page() and
export it in order to show up on perf logs and get rid of
page_pool_unmap_page().
Finally rename __page_pool_put_page() to page_pool_put_page() since we
can now directly call it from drivers and rename the existing
page_pool_put_page() to page_pool_put_full_page() since they do the same
thing but the latter is trying to sync the full DMA area.
This patch also updates netsec, mvneta and stmmac drivers which use
those functions.
Suggested-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Ido Schimmel says:
====================
mlxsw: Preparation for RTNL removal
The driver currently acquires RTNL in its route insertion path, which
contributes to very large control plane latencies. This patch set
prepares mlxsw for RTNL removal from its route insertion path in a
follow-up patch set.
Patches #1-#2 protect shared resources - KVDL and counter pool - with
their own locks. All allocations of these resources are currently
performed under RTNL, so no locks were required.
Patches #3-#7 ensure that updates to mirroring sessions only take place
in case there are active mirroring sessions. This allows us to avoid
taking RTNL when it is unnecessary, as updating of the mirroring
sessions must be performed under RTNL for the time being.
Patches #8-#10 replace the use of APIs that assume that RTNL is taken
with their RCU counterparts. Specifically, patches #8 and #9 replace
__in_dev_get_rtnl() with __in_dev_get_rcu() under RCU read-side critical
section. Patch #10 replaces __dev_get_by_index() with
dev_get_by_index_rcu().
Patches #11-#15 perform small adjustments in the code to make it easier
to later introduce a router lock instead of relying on RTNL.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The device supports a single VTEP whose configuration is shared between
all VXLAN tunnels.
While the shared configuration is cleared upon the destruction of the
last tunnel - in mlxsw_sp_nve_tunnel_fini() - it is set in
mlxsw_sp_nve_fid_enable(), after calling mlxsw_sp_nve_tunnel_init().
Make tunnel initialization and destruction symmetric and set the
configuration in mlxsw_sp_nve_tunnel_init().
This will later allow us to protect the shared configuration with a
lock.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
After the previous patch, all the callers of mlxsw_sp_rif_find_by_dev()
outside of the routing code use it to understand if a RIF exists for the
passed netdev.
Therefore, export a function to check if a RIF exists and make
mlxsw_sp_rif_find_by_dev() internal to the routing code.
This will later allow us to more easily introduce the router lock which
will also protect the RIFs.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There are currently 5 users of mlxsw_sp_rif_find_by_dev() outside of the
routing code. Only one call site actually needs to dereference the
router interface (RIF). The rest merely need to know if a RIF exists for
the provided netdev.
Convert this call site to query the needed information directly from the
routing code instead of dereferencing the RIF.
This will later allow us to replace mlxsw_sp_rif_find_by_dev() with a
function that checks if a RIF exist.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The function de-associates the port-vlan from its router interface
(RIF). It is called both from the netdev notifier block and the inetaddr
notifier block that will soon hold the router lock.
Make sure that router code calls the internal version, as it will
already have the router lock held when the function is called.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The function removes the FDB entry that directs the macvlan's MAC to the
router port. It is called from both the netdev notifier block and the
inetaddr notifier block that will soon hold the router lock.
Make sure that only the netdev notifier calls the exported version, so
that is will take the router lock, which will already be held by the
inetaddr notifier.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
device
The function that resolves the underlay device of the IPIP tunnel
assumes that RTNL is taken, but this will not be correct when RTNL is
removed from the route insertion path.
Convert the function to use dev_get_by_index_rcu() instead of
__dev_get_by_index() and make sure it is always called from an RCU
read-side critical section.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
IPv6 addresses are deleted in an atomic context, so the driver defers
the potential teardown of the associated router interface (RIF) to a
work item that takes RTNL.
The RIF is only destroyed if the associated netdev does not have any IP
addresses (both IPv4 and IPv6). The IPv4 device ('struct in_device') is
currently fetched via __in_dev_get_rtnl() which assumes RTNL is taken.
Since RTNL is going to be removed, convert it to use __in_dev_get_rcu()
from an RCU read-side critical section.
Note that the IPv6 device ('struct inet6_dev') is fetched via
__in6_dev_get(), which does not require RTNL.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
RTNL is going to be removed from route insertion path, so use
__in_dev_get_rcu() from an RCU read-side critical section instead of
__in_dev_get_rtnl() which assumes RTNL is taken.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In order not to needlessly schedule the work item that updates the
mirroring agents, only schedule it if there are any mirroring agents
present.
This is done by adding an atomic counter that counts the active
mirroring agents.
It is incremented / decremented whenever a mirroring agent is created /
destroyed. It is read before scheduling the work item and in the
devlink-resource occupancy callback.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Previous patch added a work item in the mirroring code that will take
care of updating the active mirroring agents in response to different
events.
Change the mirroring agents update function - mlxsw_sp_span_respin() -
to invoke this work item when called.
Therefore there is no need for callers to schedule a work item
themselves.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The driver updates its mirroring agents whenever it receives a
notification about an event that can affect these. For example, the
addition of a route might require the driver to change the egress port
of an ERSPAN session.
Currently, RTNL needs to be held when these agents are updates, so the
driver either:
1. Calls directly into the mirroring code, in case RTNL is held
2. Schedules a work item that will take RTNL and call into the mirroring
code
Simplify this by having the mirroring code schedule the work item for
the update instead of requiring callers to schedule a work item
themselves.
The conversion of the callers will be done in the next patch to make
review easier.
This will later allow us to remove RTNL from different parts of the
driver. It will also allow us to only schedule the work item in case
there are active mirroring agents, which is information private to the
mirroring code.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Allocate the main mirroring struct and the individual structs for the
different mirroring agents in a single allocation.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The struct holding the different mirroring agents is currently allocated
as part of the main driver struct. This is unlike other driver modules.
Allocate the memory required to store the different mirroring agents as
part of the initialization of the mirroring module.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The counter pool is a shared resource. It is used by both the ACL code
to allocate counters for actions and by the routing code to allocate
counters for adjacency entries (for example).
Currently, all allocations are protected by RTNL, but this is going to
change with the removal of RTNL from the routing code.
Therefore, protect counter allocations with a spin lock.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The KVDL is used to store objects allocated throughout various places
in the driver. For example, both nexthops (adjacency entries) and ACL
actions are stored in the KVDL.
Currently, all allocations are protected by RTNL, but this is going to
change with the removal of RTNL from the routing code.
Therefore, protect KVDL allocations with a lock. A mutex is used since
the free operation can block in Spectrum-2.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
TNODE_KMALLOC_MAX and VERSION are not used, so remove them
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
this macro is never used, so remove it
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Variables declared in a switch statement before any case statements
cannot be automatically initialized with compiler instrumentation (as
they are not part of any execution flow). With GCC's proposed automatic
stack variable initialization feature, this triggers a warning (and they
don't get initialized). Clang's automatic stack variable initialization
(via CONFIG_INIT_STACK_ALL=y) doesn't throw a warning, but it also
doesn't initialize such variables[1]. Note that these warnings (or silent
skipping) happen before the dead-store elimination optimization phase,
so even when the automatic initializations are later elided in favor of
direct initializations, the warnings remain.
To avoid these problems, move such variables into the "case" where
they're used or lift them up into the main function body.
net/openvswitch/flow_netlink.c: In function ‘validate_set’:
net/openvswitch/flow_netlink.c:2711:29: warning: statement will never be executed [-Wswitch-unreachable]
2711 | const struct ovs_key_ipv4 *ipv4_key;
| ^~~~~~~~
[1] https://bugs.llvm.org/show_bug.cgi?id=44916
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Variables declared in a switch statement before any case statements
cannot be automatically initialized with compiler instrumentation (as
they are not part of any execution flow). With GCC's proposed automatic
stack variable initialization feature, this triggers a warning (and they
don't get initialized). Clang's automatic stack variable initialization
(via CONFIG_INIT_STACK_ALL=y) doesn't throw a warning, but it also
doesn't initialize such variables[1]. Note that these warnings (or silent
skipping) happen before the dead-store elimination optimization phase,
so even when the automatic initializations are later elided in favor of
direct initializations, the warnings remain.
To avoid these problems, move such variables into the "case" where
they're used or lift them up into the main function body.
net/ipv6/ip6_gre.c: In function ‘ip6gre_err’:
net/ipv6/ip6_gre.c:440:32: warning: statement will never be executed [-Wswitch-unreachable]
440 | struct ipv6_tlv_tnl_enc_lim *tel;
| ^~~
net/ipv6/ip6_tunnel.c: In function ‘ip6_tnl_err’:
net/ipv6/ip6_tunnel.c:520:32: warning: statement will never be executed [-Wswitch-unreachable]
520 | struct ipv6_tlv_tnl_enc_lim *tel;
| ^~~
[1] https://bugs.llvm.org/show_bug.cgi?id=44916
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Variables declared in a switch statement before any case statements
cannot be automatically initialized with compiler instrumentation (as
they are not part of any execution flow). With GCC's proposed automatic
stack variable initialization feature, this triggers a warning (and they
don't get initialized). Clang's automatic stack variable initialization
(via CONFIG_INIT_STACK_ALL=y) doesn't throw a warning, but it also
doesn't initialize such variables[1]. Note that these warnings (or silent
skipping) happen before the dead-store elimination optimization phase,
so even when the automatic initializations are later elided in favor of
direct initializations, the warnings remain.
To avoid these problems, move such variables into the "case" where
they're used or lift them up into the main function body.
net/core/skbuff.c: In function ‘skb_checksum_setup_ip’:
net/core/skbuff.c:4809:7: warning: statement will never be executed [-Wswitch-unreachable]
4809 | int err;
| ^~~
[1] https://bugs.llvm.org/show_bug.cgi?id=44916
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
arch/x86/kvm/emulate.c: In function 'x86_emulate_insn':
arch/x86/kvm/emulate.c:5686:22: error: cast between incompatible
function types from 'int (*)(struct x86_emulate_ctxt *)' to 'void
(*)(struct fastop *)' [-Werror=cast-function-type]
rc = fastop(ctxt, (fastop_t)ctxt->execute);
Fix it by using an unnamed union of a (*execute) function pointer and a
(*fastop) function pointer.
Fixes: 3009afc6e39e ("KVM: x86: Use a typedef for fastop functions")
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Qian Cai <cai@lca.pw>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The "u" field in the event has three states, -1/0/1. Using u8 however means that
comparison with -1 will always fail, so change to signed char.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Variables declared in a switch statement before any case statements
cannot be automatically initialized with compiler instrumentation (as
they are not part of any execution flow). With GCC's proposed automatic
stack variable initialization feature, this triggers a warning (and they
don't get initialized). Clang's automatic stack variable initialization
(via CONFIG_INIT_STACK_ALL=y) doesn't throw a warning, but it also
doesn't initialize such variables[1]. Note that these warnings (or silent
skipping) happen before the dead-store elimination optimization phase,
so even when the automatic initializations are later elided in favor of
direct initializations, the warnings remain.
To avoid these problems, move such variables into the "case" where
they're used or lift them up into the main function body.
arch/x86/xen/enlighten_pv.c: In function ‘xen_write_msr_safe’:
arch/x86/xen/enlighten_pv.c:904:12: warning: statement will never be executed [-Wswitch-unreachable]
904 | unsigned which;
| ^~~~~
[1] https://bugs.llvm.org/show_bug.cgi?id=44916
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20200220062318.69299-1-keescook@chromium.org
Reviewed-by: Juergen Gross <jgross@suse.com>
[boris: made @which an 'unsigned int']
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
|
|
Currently if you build with O=... the rseq tests don't build:
$ make O=$PWD/output -C tools/testing/selftests/ TARGETS=rseq
make: Entering directory '/linux/tools/testing/selftests'
...
make[1]: Entering directory '/linux/tools/testing/selftests/rseq'
gcc -O2 -Wall -g -I./ -I../../../../usr/include/ -L./ -Wl,-rpath=./ -shared -fPIC rseq.c -lpthread -o /linux/output/rseq/librseq.so
gcc -O2 -Wall -g -I./ -I../../../../usr/include/ -L./ -Wl,-rpath=./ basic_test.c -lpthread -lrseq -o /linux/output/rseq/basic_test
/usr/bin/ld: cannot find -lrseq
collect2: error: ld returned 1 exit status
This is because the library search path points to the source
directory, not the output.
We can fix it by changing the library search path to $(OUTPUT).
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
Commit 852c8cbf34d3 ("selftests/kselftest/runner.sh: Add 45 second
timeout per test") added a 45 second timeout for tests, and also added
a way for tests to customise the timeout via a settings file.
For example the ftrace tests take multiple minutes to run, so they
were given longer in commit b43e78f65b1d ("tracing/selftests: Turn off
timeout setting").
This works when the tests are run from the source tree. However if the
tests are installed with "make -C tools/testing/selftests install",
the settings files are not copied into the install directory. When the
tests are then run from the install directory the longer timeouts are
not applied and the tests timeout incorrectly.
So add the settings files to TEST_FILES of the appropriate Makefiles
to cause the settings files to be installed using the existing install
logic.
Fixes: 852c8cbf34d3 ("selftests/kselftest/runner.sh: Add 45 second timeout per test")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
Commit 46d1a0f03d66 ("selftests/lkdtm: Add tests for LKDTM targets")
added generation of lkdtm test scripts.
Ignore those generated scripts when performing 'git status'
Fixes: 46d1a0f03d66 ("selftests/lkdtm: Add tests for LKDTM targets")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
Currently the arm64 kernel ignores the top address byte passed to brk(),
mmap() and mremap(). When the user is not aware of the 56-bit address
limit or relies on the kernel to return an error, untagging such
pointers has the potential to create address aliases in user-space.
Passing a tagged address to munmap(), madvise() is permitted since the
tagged pointer is expected to be inside an existing mapping.
The current behaviour breaks the existing glibc malloc() implementation
which relies on brk() with an address beyond 56-bit to be rejected by
the kernel.
Remove untagging in the above functions by partially reverting commit
ce18d171cb73 ("mm: untag user pointers in mmap/munmap/mremap/brk"). In
addition, update the arm64 tagged-address-abi.rst document accordingly.
Link: https://bugzilla.redhat.com/1797052
Fixes: ce18d171cb73 ("mm: untag user pointers in mmap/munmap/mremap/brk")
Cc: <stable@vger.kernel.org> # 5.4.x-
Cc: Florian Weimer <fweimer@redhat.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Reported-by: Victor Stinner <vstinner@redhat.com>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Fix trivial spelling error enought to enough in memory.rst.
Cc: trivial@kernel.org
Signed-off-by: Scott Branden <scott.branden@broadcom.com>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
We tested a soft lockup problem in linux 4.19 which could also
be found in linux 5.x.
When dir inode takes up a large number of blocks, and if the
directory is growing when we are searching, it's possible the
restart branch could be called many times, and the do while loop
could hold cpu a long time.
Here is the call trace in linux 4.19.
[ 473.756186] Call trace:
[ 473.756196] dump_backtrace+0x0/0x198
[ 473.756199] show_stack+0x24/0x30
[ 473.756205] dump_stack+0xa4/0xcc
[ 473.756210] watchdog_timer_fn+0x300/0x3e8
[ 473.756215] __hrtimer_run_queues+0x114/0x358
[ 473.756217] hrtimer_interrupt+0x104/0x2d8
[ 473.756222] arch_timer_handler_virt+0x38/0x58
[ 473.756226] handle_percpu_devid_irq+0x90/0x248
[ 473.756231] generic_handle_irq+0x34/0x50
[ 473.756234] __handle_domain_irq+0x68/0xc0
[ 473.756236] gic_handle_irq+0x6c/0x150
[ 473.756238] el1_irq+0xb8/0x140
[ 473.756286] ext4_es_lookup_extent+0xdc/0x258 [ext4]
[ 473.756310] ext4_map_blocks+0x64/0x5c0 [ext4]
[ 473.756333] ext4_getblk+0x6c/0x1d0 [ext4]
[ 473.756356] ext4_bread_batch+0x7c/0x1f8 [ext4]
[ 473.756379] ext4_find_entry+0x124/0x3f8 [ext4]
[ 473.756402] ext4_lookup+0x8c/0x258 [ext4]
[ 473.756407] __lookup_hash+0x8c/0xe8
[ 473.756411] filename_create+0xa0/0x170
[ 473.756413] do_mkdirat+0x6c/0x140
[ 473.756415] __arm64_sys_mkdirat+0x28/0x38
[ 473.756419] el0_svc_common+0x78/0x130
[ 473.756421] el0_svc_handler+0x38/0x78
[ 473.756423] el0_svc+0x8/0xc
[ 485.755156] watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [tmp:5149]
Add cond_resched() to avoid soft lockup and to provide a better
system responding.
Link: https://lore.kernel.org/r/20200215080206.13293-1-luoshijie1@huawei.com
Signed-off-by: Shijie Luo <luoshijie1@huawei.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: stable@kernel.org
|
|
EXT4_I(inode)->i_disksize could be accessed concurrently as noticed by
KCSAN,
BUG: KCSAN: data-race in ext4_write_end [ext4] / ext4_writepages [ext4]
write to 0xffff91c6713b00f8 of 8 bytes by task 49268 on cpu 127:
ext4_write_end+0x4e3/0x750 [ext4]
ext4_update_i_disksize at fs/ext4/ext4.h:3032
(inlined by) ext4_update_inode_size at fs/ext4/ext4.h:3046
(inlined by) ext4_write_end at fs/ext4/inode.c:1287
generic_perform_write+0x208/0x2a0
ext4_buffered_write_iter+0x11f/0x210 [ext4]
ext4_file_write_iter+0xce/0x9e0 [ext4]
new_sync_write+0x29c/0x3b0
__vfs_write+0x92/0xa0
vfs_write+0x103/0x260
ksys_write+0x9d/0x130
__x64_sys_write+0x4c/0x60
do_syscall_64+0x91/0xb47
entry_SYSCALL_64_after_hwframe+0x49/0xbe
read to 0xffff91c6713b00f8 of 8 bytes by task 24872 on cpu 37:
ext4_writepages+0x10ac/0x1d00 [ext4]
mpage_map_and_submit_extent at fs/ext4/inode.c:2468
(inlined by) ext4_writepages at fs/ext4/inode.c:2772
do_writepages+0x5e/0x130
__writeback_single_inode+0xeb/0xb20
writeback_sb_inodes+0x429/0x900
__writeback_inodes_wb+0xc4/0x150
wb_writeback+0x4bd/0x870
wb_workfn+0x6b4/0x960
process_one_work+0x54c/0xbe0
worker_thread+0x80/0x650
kthread+0x1e0/0x200
ret_from_fork+0x27/0x50
Reported by Kernel Concurrency Sanitizer on:
CPU: 37 PID: 24872 Comm: kworker/u261:2 Tainted: G W O L 5.5.0-next-20200204+ #5
Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019
Workqueue: writeback wb_workfn (flush-7:0)
Since only the read is operating as lockless (outside of the
"i_data_sem"), load tearing could introduce a logic bug. Fix it by
adding READ_ONCE() for the read and WRITE_ONCE() for the write.
Signed-off-by: Qian Cai <cai@lca.pw>
Link: https://lore.kernel.org/r/1581085751-31793-1-git-send-email-cai@lca.pw
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
|
|
Nothing major here, another TU1xx modesetting fix, and hooking up
ACR/GR support on TU11x now that NVIDIA have made the firmware
available.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Ben Skeggs <skeggsb@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ <CACAvsv64yBq4KHJ8D-5HQ5eeotApJSMiD+V2ut4f3BonUggf0Q@mail.gmail.com
|
|
Damien Le Moal reports a lockdep splat with the acpi_power_meter,
observed with Linux v5.5 and later.
======================================================
WARNING: possible circular locking dependency detected
5.6.0-rc2+ #629 Not tainted
------------------------------------------------------
python/1397 is trying to acquire lock:
ffff888619080070 (&resource->lock){+.+.}, at: show_power+0x3c/0xa0 [acpi_power_meter]
but task is already holding lock:
ffff88881643f188 (kn->count#119){++++}, at: kernfs_seq_start+0x6a/0x160
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (kn->count#119){++++}:
__kernfs_remove+0x626/0x7e0
kernfs_remove_by_name_ns+0x41/0x80
remove_attrs+0xcb/0x3c0 [acpi_power_meter]
acpi_power_meter_notify+0x1f7/0x310 [acpi_power_meter]
acpi_ev_notify_dispatch+0x198/0x1f3
acpi_os_execute_deferred+0x4d/0x70
process_one_work+0x7c8/0x1340
worker_thread+0x94/0xc70
kthread+0x2ed/0x3f0
ret_from_fork+0x24/0x30
-> #0 (&resource->lock){+.+.}:
__lock_acquire+0x20be/0x49b0
lock_acquire+0x127/0x340
__mutex_lock+0x15b/0x1350
show_power+0x3c/0xa0 [acpi_power_meter]
dev_attr_show+0x3f/0x80
sysfs_kf_seq_show+0x216/0x410
seq_read+0x407/0xf90
vfs_read+0x152/0x2c0
ksys_read+0xf3/0x1d0
do_syscall_64+0x95/0x1010
entry_SYSCALL_64_after_hwframe+0x49/0xbe
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(kn->count#119);
lock(&resource->lock);
lock(kn->count#119);
lock(&resource->lock);
*** DEADLOCK ***
4 locks held by python/1397:
#0: ffff8890242d64e0 (&f->f_pos_lock){+.+.}, at: __fdget_pos+0x9b/0xb0
#1: ffff889040be74e0 (&p->lock){+.+.}, at: seq_read+0x6b/0xf90
#2: ffff8890448eb880 (&of->mutex){+.+.}, at: kernfs_seq_start+0x47/0x160
#3: ffff88881643f188 (kn->count#119){++++}, at: kernfs_seq_start+0x6a/0x160
stack backtrace:
CPU: 10 PID: 1397 Comm: python Not tainted 5.6.0-rc2+ #629
Hardware name: Supermicro Super Server/X11DPL-i, BIOS 3.1 05/21/2019
Call Trace:
dump_stack+0x97/0xe0
check_noncircular+0x32e/0x3e0
? print_circular_bug.isra.0+0x1e0/0x1e0
? unwind_next_frame+0xb9a/0x1890
? entry_SYSCALL_64_after_hwframe+0x49/0xbe
? graph_lock+0x79/0x170
? __lockdep_reset_lock+0x3c0/0x3c0
? mark_lock+0xbc/0x1150
__lock_acquire+0x20be/0x49b0
? mark_held_locks+0xe0/0xe0
? stack_trace_save+0x91/0xc0
lock_acquire+0x127/0x340
? show_power+0x3c/0xa0 [acpi_power_meter]
? device_remove_bin_file+0x10/0x10
? device_remove_bin_file+0x10/0x10
__mutex_lock+0x15b/0x1350
? show_power+0x3c/0xa0 [acpi_power_meter]
? show_power+0x3c/0xa0 [acpi_power_meter]
? mutex_lock_io_nested+0x11f0/0x11f0
? lock_downgrade+0x6a0/0x6a0
? kernfs_seq_start+0x47/0x160
? lock_acquire+0x127/0x340
? kernfs_seq_start+0x6a/0x160
? device_remove_bin_file+0x10/0x10
? show_power+0x3c/0xa0 [acpi_power_meter]
show_power+0x3c/0xa0 [acpi_power_meter]
dev_attr_show+0x3f/0x80
? memset+0x20/0x40
sysfs_kf_seq_show+0x216/0x410
seq_read+0x407/0xf90
? security_file_permission+0x16f/0x2c0
vfs_read+0x152/0x2c0
Problem is that reading an attribute takes the kernfs lock in the kernfs
code, then resource->lock in the driver. During an ACPI notification, the
opposite happens: The resource lock is taken first, followed by the kernfs
lock when sysfs attributes are removed and re-created. Presumably this is
now seen due to some locking related changes in kernfs after v5.4, but it
was likely always a problem.
Fix the problem by not blindly acquiring the lock in the notification
function. It is only needed to protect the various update functions.
However, those update functions are called anyway when sysfs attributes
are read. This means that we can just stop calling those functions from
the notifier, and the resource lock in the notifier function is no longer
needed.
That leaves two situations:
First, METER_NOTIFY_CONFIG removes and re-allocates capability strings.
While it did so under the resource lock, _displaying_ those strings was not
protected, creating a race condition. To solve this problem, selectively
protect both removal/creation and reporting of capability attributes with
the resource lock.
Second, removing and re-creating the attribute files is no longer protected
by the resource lock. That doesn't matter since access to each individual
attribute is protected by the kernfs lock. Userspace may get messed up if
attributes disappear and reappear under its nose, but that is not different
than today, and there is nothing we can do about it without major driver
restructuring.
Last but not least, when removing the driver, remove attribute functions
first, then release capability strings. This avoids yet another race
condition.
Reported-by: Damien Le Moal <Damien.LeMoal@wdc.com>
Cc: Damien Le Moal <Damien.LeMoal@wdc.com>
Cc: stable@vger.kernel.org # v5.5+
Tested-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull Kselftest fixes from Shuah Khan:
"Fixes to build failures and other test bugs"
* tag 'linux-kselftest-5.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests: openat2: fix build error on newer glibc
selftests: use LDLIBS for libraries instead of LDFLAGS
selftests: fix too long argument
selftests: allow detection of build failures
Kernel selftests: tpm2: check for tpm support
selftests/ftrace: Have pid filter test use instance flag
selftests: fix spelling mistaked "chaigned" -> "chained"
|
|
Commit 1de243b07666 ("media: dt-bindings: media: sun4i-csi: Add compatible
for CSI1 on A10/A20") introduced support for the CSI1 controller on A10 and
A20 that unlike CSI0 doesn't have an ISP and therefore only have two
clocks, the bus and module clocks.
The clocks and clock-names properties have thus been modified to allow
either two or tree clocks. However, the current list has the ISP clock at
the second position, which means the bindings expects a list of either
bus and isp, or bus, isp and mod. The initial intent of the patch was
obviously to have bus and mod in the former case.
Let's fix the binding so that it validates properly.
Fixes: 1de243b07666 ("media: dt-bindings: media: sun4i-csi: Add compatible for CSI1 on A10/A20")
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
The Allwinner CSI controller is sitting beside the MBUS that is represented
as an interconnect.
Make sure that the interconnect properties are valid in the binding.
Fixes: 7866d6903ce8 ("media: dt-bindings: media: sun4i-csi: Add compatible for CSI0 on R40")
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Signed-off-by: Rob Herring <robh@kernel.org>
|