2020-02-25  selftests/bpf: Run SYN cookies with reuseport BPF test only for TCP  (Jakub Sitnicki, 1 file, -8/+9)
Currently we run SYN cookies test for all socket types and mark the test as skipped if socket type is not compatible. This causes confusion because skipped test might indicate a problem with the testing environment. Instead, run the test only for the socket type which supports SYN cookies. Also, switch to using designated initializers when setting up tests, so that we can tweak only some test parameters, leaving the rest initialized to default values. Fixes: eecd618b4516 ("selftests/bpf: Mark SYN cookie test skipped for UDP sockets") Reported-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200224135327.121542-2-jakub@cloudflare.com
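To illustrate the designated-initializer style mentioned above, here is a minimal sketch; the struct name and fields are illustrative, not the actual selftest definitions:

    /* Hypothetical test description: only the fields that differ from the
     * defaults need to be spelled out, the rest stay zero-initialized.
     */
    struct test_desc {
            const char *name;
            int sotype;
            bool no_inner_map;
    };

    static struct test_desc tests[] = {
            {
                    .name = "syncookie",
                    .sotype = SOCK_STREAM,  /* run only for TCP */
            },
    };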
2020-02-25  selftests/bpf: Run reuseport tests only with supported socket types  (Jakub Sitnicki, 1 file, -7/+6)
SOCKMAP and SOCKHASH map types can be used with reuseport BPF programs but don't yet support storing UDP sockets. Instead of marking UDP tests with SOCK{MAP,HASH} as skipped, don't run them at all. A skipped test might signal that the test environment is not suitable for running the test, while in reality the functionality is not implemented in the kernel yet.

Before:
  sh# ./test_progs -t select_reuseport
  …
  #40 select_reuseport:OK
  Summary: 1/126 PASSED, 30 SKIPPED, 0 FAILED

After:
  sh# ./test_progs -t select_reuseport
  …
  #40 select_reuseport:OK
  Summary: 1/98 PASSED, 2 SKIPPED, 0 FAILED

The remaining two skipped tests are SYN cookies tests, which will be addressed in the subsequent patch.

Fixes: 11318ba8cafd ("selftests/bpf: Extend SK_REUSEPORT tests to cover SOCKMAP/SOCKHASH")
Reported-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200224135327.121542-1-jakub@cloudflare.com
2020-02-25  Merge branch 'BPF_and_RT'  (Alexei Starovoitov, 18 files, -142/+283)
Thomas Gleixner says:
====================
This is the third version of the BPF/RT patch set which makes both coexist nicely. The long explanation can be found in the cover letter of the V1 submission:
  https://lore.kernel.org/r/20200214133917.304937432@linutronix.de

V2 is here:
  https://lore.kernel.org/r/20200220204517.863202864@linutronix.de

The following changes vs. V2 have been made:

 - Rebased to bpf-next, adjusted to the lock changes in the hashmap code.

 - Split the preallocation enforcement patch for instrumentation type BPF programs into two pieces:

   1) Emit a one-time warning on !RT kernels when any instrumentation type BPF program uses run-time allocation. Emit also a corresponding warning in the verifier log. But allow the program to run for backward compatibility sake. After a grace period this should be enforced.

   2) On RT reject such programs because on RT the memory allocator cannot be called from truly atomic contexts.

 - Fixed the fallout from V2 as reported by Alexei and 0-day

 - Removed the redundant preempt_disable() from trace_call_bpf()

 - Removed the unused export of trace_call_bpf()
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-02-25  bpf/stackmap: Don't trylock mmap_sem with PREEMPT_RT and interrupts disabled  (David Miller, 1 file, -3/+15)
In an RT kernel down_read_trylock() cannot be used from NMI context and up_read_non_owner() is another problematic issue. So in such a configuration, simply elide the annotated stackmap and just report the raw IPs. In the longer term, it might be possible to provide atomic friendly versions of the page cache traversal which will at least provide the info if the pages are resident and don't need to be paged in. [ tglx: Use IS_ENABLED() to avoid the #ifdeffery, fixup the irq work callback and add a comment ] Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200224145644.708960317@linutronix.de
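Sketched in isolation, the decision the patch makes could look like this (illustrative helper, not the actual stackmap code):

    /* Sketch only: decide whether the annotated (build-id) stackmap path
     * may be used.  On PREEMPT_RT, mmap_sem cannot be trylocked from
     * NMI/IRQ context, so fall back to raw instruction pointers there.
     */
    static bool stack_map_may_use_build_id(bool in_nmi_or_irq)
    {
            if (in_nmi_or_irq && IS_ENABLED(CONFIG_PREEMPT_RT))
                    return false;
            return true;
    }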
2020-02-25  bpf, lpm: Make locking RT friendly  (Thomas Gleixner, 1 file, -6/+6)
The LPM trie map cannot be used in contexts like perf, kprobes and tracing as this map type dynamically allocates memory. The memory allocation happens with a raw spinlock held which is a truly spinning lock on a PREEMPT RT enabled kernel which disables preemption and interrupts. As RT does not allow memory allocation from such a section for various reasons, convert the raw spinlock to a regular spinlock. On a RT enabled kernel these locks are substituted by 'sleeping' spinlocks which provide the proper protection but keep the code preemptible. On a non-RT kernel regular spinlocks map to raw spinlocks, i.e. this does not cause any functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200224145644.602129531@linutronix.de
2020-02-25  bpf: Prepare hashtab locking for PREEMPT_RT  (Thomas Gleixner, 1 file, -9/+56)
PREEMPT_RT forbids certain operations like memory allocations (even with GFP_ATOMIC) from atomic contexts. This is required because even with GFP_ATOMIC the memory allocator calls into code paths which acquire locks with long held lock sections. To ensure the deterministic behaviour these locks are regular spinlocks, which are converted to 'sleepable' spinlocks on RT. The only true atomic contexts on an RT kernel are the low level hardware handling, scheduling, low level interrupt handling, NMIs etc. None of these contexts should ever do memory allocations. As regular device interrupt handlers and soft interrupts are forced into thread context, the existing code which does spin_lock*(); alloc(GFP_ATOMIC); spin_unlock*(); just works. In theory the BPF locks could be converted to regular spinlocks as well, but the bucket locks and percpu_freelist locks can be taken from arbitrary contexts (perf, kprobes, tracepoints) which are required to be atomic contexts even on RT. These mechanisms require preallocated maps, so there is no need to invoke memory allocations within the lock held sections. BPF maps which need dynamic allocation are only used from (forced) thread context on RT and can therefore use regular spinlocks which in turn allows to invoke memory allocations from the lock held section. To achieve this make the hash bucket lock a union of a raw and a regular spinlock and initialize and lock/unlock either the raw spinlock for preallocated maps or the regular variant for maps which require memory allocations. On a non RT kernel this distinction is neither possible nor required. spinlock maps to raw_spinlock and the extra code and conditional is optimized out by the compiler. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200224145644.509685912@linutronix.de
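A condensed sketch of that scheme; field and helper names approximate the description, not necessarily the final kernel/bpf/hashtab.c code:

    struct bucket {
            struct hlist_nulls_head head;
            union {
                    raw_spinlock_t raw_lock;  /* preallocated maps */
                    spinlock_t     lock;      /* maps that allocate at runtime */
            };
    };

    static inline bool htab_use_raw_lock(const struct bpf_htab *htab)
    {
            /* raw lock for preallocated maps, or for everything on !RT */
            return !IS_ENABLED(CONFIG_PREEMPT_RT) || htab_is_prealloc(htab);
    }

    static inline void htab_lock_bucket(const struct bpf_htab *htab,
                                        struct bucket *b, unsigned long *flags)
    {
            if (htab_use_raw_lock(htab))
                    raw_spin_lock_irqsave(&b->raw_lock, *flags);
            else
                    spin_lock_irqsave(&b->lock, *flags);
    }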
2020-02-25  bpf: Factor out hashtab bucket lock operations  (Thomas Gleixner, 1 file, -23/+46)
As a preparation for making the BPF locking RT friendly, factor out the hash bucket lock operations into inline functions. This allows to do the necessary RT modification in one place instead of sprinkling it all over the place. No functional change. The now unused htab argument of the lock/unlock functions will be used in the next step which adds PREEMPT_RT support. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200224145644.420416916@linutronix.de
2020-02-25  bpf: Replace open coded recursion prevention in sys_bpf()  (Thomas Gleixner, 1 file, -19/+8)
The required protection is that the caller cannot be migrated to a different CPU as these functions end up in places which take either a hash bucket lock or might trigger a kprobe inside the memory allocator. Both scenarios can lead to deadlocks. The deadlock prevention is per CPU by incrementing a per CPU variable which temporarily blocks the invocation of BPF programs from perf and kprobes. Replace the open coded preempt_[dis|en]able and __this_cpu_[inc|dec] pairs with the new helper functions. These functions are already prepared to make BPF work on PREEMPT_RT enabled kernels. No functional change for !RT kernels. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200224145644.317843926@linutronix.de
2020-02-25  bpf: Use recursion prevention helpers in hashtab code  (Thomas Gleixner, 1 file, -8/+4)
The required protection is that the caller cannot be migrated to a different CPU as these places take either a hash bucket lock or might trigger a kprobe inside the memory allocator. Both scenarios can lead to deadlocks. The deadlock prevention is per CPU by incrementing a per CPU variable which temporarily blocks the invocation of BPF programs from perf and kprobes. Replace the open coded preempt_disable/enable() and this_cpu_inc/dec() pairs with the new recursion prevention helpers to prepare BPF to work on PREEMPT_RT enabled kernels. On a non-RT kernel the migrate disable/enable in the helpers map to preempt_disable/enable(), i.e. no functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200224145644.211208533@linutronix.de
2020-02-25  bpf: Provide recursion prevention helpers  (Thomas Gleixner, 1 file, -0/+30)
The places which need to prevent the execution of trace type BPF programs to prevent deadlocks on the hash bucket lock do this open coded. Provide two inline functions, bpf_disable/enable_instrumentation() to replace these open coded protection constructs. Use migrate_disable/enable() instead of preempt_disable/enable() right away so this works on RT enabled kernels. On a !RT kernel migrate_disable / enable() are mapped to preempt_disable/enable(). These helpers use this_cpu_inc/dec() instead of __this_cpu_inc/dec() on an RT enabled kernel because migrate disabled regions are preemptible and preemption might hit in the middle of a RMW operation which can lead to inconsistent state. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200224145644.103910133@linutronix.de
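Based on the description above, the helpers look roughly like this (a sketch; see include/linux/bpf.h for the final form):

    static inline void bpf_disable_instrumentation(void)
    {
            migrate_disable();
            if (IS_ENABLED(CONFIG_PREEMPT_RT))
                    this_cpu_inc(bpf_prog_active);   /* preemptible region on RT */
            else
                    __this_cpu_inc(bpf_prog_active);
    }

    static inline void bpf_enable_instrumentation(void)
    {
            if (IS_ENABLED(CONFIG_PREEMPT_RT))
                    this_cpu_dec(bpf_prog_active);
            else
                    __this_cpu_dec(bpf_prog_active);
            migrate_enable();
    }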
2020-02-25  bpf: Use migrate_disable/enable in array macros and cgroup/lirc code.  (David Miller, 2 files, -6/+7)
Replace the preemption disable/enable with migrate_disable/enable() to reflect the actual requirement and to allow PREEMPT_RT to substitute it with an actual migration disable mechanism which does not disable preemption. Including the code paths that go via __bpf_prog_run_save_cb(). Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200224145643.998293311@linutronix.de
2020-02-25  bpf: Use migrate_disable/enable() in trampoline code.  (David Miller, 1 file, -4/+5)
Instead of preemption disable/enable to reflect the purpose. This allows PREEMPT_RT to substitute it with an actual migration disable implementation. On non RT kernels this is still mapped to preempt_disable/enable(). Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200224145643.891428873@linutronix.de
2020-02-25  bpf/tests: Use migrate disable instead of preempt disable  (David Miller, 2 files, -6/+6)
Replace the preemption disable/enable with migrate_disable/enable() to reflect the actual requirement and to allow PREEMPT_RT to substitute it with an actual migration disable mechanism which does not disable preemption. [ tglx: Switched it over to migrate disable ] Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200224145643.785306549@linutronix.de
2020-02-25  bpf: Use bpf_prog_run_pin_on_cpu() at simple call sites.  (David Miller, 5 files, -18/+6)
All of these cases are strictly of the form:

   preempt_disable();
   BPF_PROG_RUN(...);
   preempt_enable();

Replace this with bpf_prog_run_pin_on_cpu() which wraps BPF_PROG_RUN() with:

   migrate_disable();
   BPF_PROG_RUN(...);
   migrate_enable();

On non RT enabled kernels this maps to preempt_disable/enable() and on RT enabled kernels this solely prevents migration, which is sufficient as there is no requirement to prevent reentrancy to any BPF program from a preempting task. The only requirement is that the program stays on the same CPU. Therefore, this is a trivially correct transformation.

The seccomp loop does not need protection over the loop. It only needs protection per BPF filter program.

[ tglx: Converted to bpf_prog_run_pin_on_cpu() ]

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200224145643.691493094@linutronix.de
2020-02-25  bpf: Replace cant_sleep() with cant_migrate()  (Thomas Gleixner, 1 file, -1/+1)
As already discussed in the previous change which introduced BPF_RUN_PROG_PIN_ON_CPU() BPF only requires to disable migration to guarantee per CPUness. If RT substitutes the preempt disable based migration protection then the cant_sleep() check will obviously trigger as preemption is not disabled. Replace it by cant_migrate() which maps to cant_sleep() on a non RT kernel and will verify that migration is disabled on a full RT kernel. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200224145643.583038889@linutronix.de
2020-02-25  bpf: Provide bpf_prog_run_pin_on_cpu() helper  (Thomas Gleixner, 1 file, -2/+24)
BPF programs require to run on one CPU to completion as they use per CPU storage, but according to Alexei they don't need reentrancy protection as obviously BPF programs running in thread context can always be 'preempted' by hard and soft interrupts and instrumentation and the same program can run concurrently on a different CPU. The currently used mechanism to ensure CPUness is to wrap the invocation into a preempt_disable/enable() pair. Disabling preemption is also disabling migration for a task. preempt_disable/enable() is used because there is no explicit way to reliably disable only migration. Provide a separate macro to invoke a BPF program which can be used in migrateable task context. It wraps BPF_PROG_RUN() in a migrate_disable/enable() pair which maps on non RT enabled kernels to preempt_disable/enable(). On RT enabled kernels this merely disables migration. Both methods ensure that the invoked BPF program runs on one CPU to completion. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200224145643.474592620@linutronix.de
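A sketch of the helper described above, under the assumption that it simply brackets BPF_PROG_RUN() with migrate_disable/enable():

    /* Run @prog on one CPU to completion without disabling preemption
     * on RT: migration is prevented, reentrancy is not.
     */
    static inline u32 bpf_prog_run_pin_on_cpu(const struct bpf_prog *prog,
                                              const void *ctx)
    {
            u32 ret;

            migrate_disable();
            ret = BPF_PROG_RUN(prog, ctx);
            migrate_enable();
            return ret;
    }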
2020-02-25  bpf: Don't iterate over possible CPUs with interrupts disabled  (Thomas Gleixner, 1 file, -10/+10)
pcpu_freelist_populate() disables interrupts and then iterates over the possible CPUs. The reason why this disables interrupts is to silence lockdep because the invoked ___pcpu_freelist_push() takes spin locks. Neither the interrupt disabling nor the locking are required in this function because it's called during initialization and the resulting map is not yet visible to anything. Split out the actual push assignment into an inline, call it from the loop and remove the interrupt disable. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200224145643.365930116@linutronix.de
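A sketch of the split described above; the helper name is illustrative, and the populate loop (not shown) simply calls it per element without any interrupt disabling:

    /* Setup-time push without taking the per-CPU lock.  Safe because the
     * freelist is being populated before the map is visible to anything,
     * so the caller needs neither locking nor local_irq_save().
     */
    static inline void pcpu_freelist_push_node(struct pcpu_freelist_head *head,
                                               struct pcpu_freelist_node *node)
    {
            node->next = head->first;
            head->first = node;
    }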
2020-02-25  bpf: Remove recursion prevention from rcu free callback  (Thomas Gleixner, 1 file, -8/+0)
If an element is freed via RCU then recursion into BPF instrumentation functions is not a concern. The element is already detached from the map and the RCU callback does not hold any locks on which a kprobe, perf event or tracepoint attached BPF program could deadlock. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200224145643.259118710@linutronix.de
2020-02-25  perf/bpf: Remove preempt disable around BPF invocation  (Thomas Gleixner, 1 file, -2/+0)
The BPF invocation from the perf event overflow handler does not require to disable preemption because this is called from NMI or at least hard interrupt context which is already non-preemptible. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200224145643.151953573@linutronix.de
2020-02-25  bpf/trace: Remove redundant preempt_disable from trace_call_bpf()  (Thomas Gleixner, 1 file, -2/+1)
Similar to __bpf_trace_run this is redundant because __bpf_trace_run() is invoked from a trace point via __DO_TRACE() which already disables preemption _before_ invoking any of the functions which are attached to a trace point. Remove it and add a cant_sleep() check. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200224145643.059995527@linutronix.de
2020-02-25  bpf: disable preemption for bpf progs attached to uprobe  (Alexei Starovoitov, 1 file, -2/+9)
trace_call_bpf() no longer disables preemption on its own. All callers of this function have to do it explicitly. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Thomas Gleixner <tglx@linutronix.de>
2020-02-25  bpf/trace: Remove EXPORT from trace_call_bpf()  (Thomas Gleixner, 1 file, -1/+0)
All callers are built in. No point to export this. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-02-25  bpf/tracing: Remove redundant preempt_disable() in __bpf_trace_run()  (Thomas Gleixner, 1 file, -2/+1)
__bpf_trace_run() disables preemption around the BPF_PROG_RUN() invocation. This is redundant because __bpf_trace_run() is invoked from a trace point via __DO_TRACE() which already disables preemption _before_ invoking any of the functions which are attached to a trace point. Remove it and add a cant_sleep() check. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200224145642.847220186@linutronix.de
2020-02-25  bpf: Update locking comment in hashtab code  (Thomas Gleixner, 1 file, -4/+20)
The comment where the bucket lock is acquired says: /* bpf_map_update_elem() can be called in_irq() */ which is not really helpful and aside from that it does not explain the subtle details of the hash bucket locks especially in the context of BPF and perf, kprobes and tracing. Add a comment at the top of the file which explains the protection scopes and the details how potential deadlocks are prevented. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200224145642.755793061@linutronix.de
2020-02-25  bpf: Enforce preallocation for instrumentation programs on RT  (Thomas Gleixner, 1 file, -4/+9)
Aside from the general unsafety of run-time map allocation for instrumentation type programs, RT enabled kernels have another constraint: The instrumentation programs are invoked with preemption disabled, but the memory allocator spinlocks cannot be acquired in atomic context because they are converted to 'sleeping' spinlocks on RT. Therefore enforce map preallocation for these program types when RT is enabled. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200224145642.648784007@linutronix.de
2020-02-25  bpf: Tighten the requirements for preallocated hash maps  (Thomas Gleixner, 1 file, -11/+28)
The assumption that only programs attached to perf NMI events can deadlock on memory allocators is wrong. Assume the following simplified callchain:

   kmalloc() from regular non BPF context
     cache empty
       freelist empty
         lock(zone->lock);
           tracepoint or kprobe
             BPF()
               update_elem()
                 lock(bucket)
                   kmalloc()
                     cache empty
                       freelist empty
                         lock(zone->lock);  <- DEADLOCK

There are other ways which do not involve locking to create wreckage:

   kmalloc() from regular non BPF context
     local_irq_save();
     ...
     obj = slab_first();
       kprobe()
         BPF()
           update_elem()
             lock(bucket)
               kmalloc()
                 local_irq_save();
                 ...
                 obj = slab_first();  <- Same object as above ...

So preallocation _must_ be enforced for all variants of intrusive instrumentation. Unfortunately immediate enforcement would break backwards compatibility, so for now such programs still are allowed to run, but a one time warning is emitted in dmesg and the verifier emits a warning in the verifier log as well so developers are made aware about this and can fix their programs before the enforcement becomes mandatory.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200224145642.540542802@linutronix.de
2020-02-25  Merge tag 'mac80211-for-net-2020-02-24' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211  (David S. Miller, 4 files, -9/+6)
Johannes Berg:
====================
A few fixes:
 * remove a double mutex-unlock
 * fix a leak in an error path
 * NULL pointer check
 * include if_vlan.h where needed
 * avoid RCU list traversal when not under RCU
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-25  Merge tag 'mac80211-next-for-net-next-2020-02-24' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next  (David S. Miller, 24 files, -150/+909)
Johannes Berg says:
====================
A new set of changes:
 * lots of small documentation fixes, from Jérôme Pouiller
 * beacon protection (BIGTK) support from Jouni Malinen
 * some initial code for TID configuration, from Tamizh chelvam
 * I reverted some new API before it's actually used, because it's wrong to mix controlled port and preauth
 * a few other cleanups/fixes
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-25  net: qrtr: fix spelling mistake "serivce" -> "service"  (Colin Ian King, 1 file, -1/+1)
There is a spelling mistake in a pr_err message. Fix it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-25  net: ethernet: stmmac: don't warn about missing optional wakeup IRQ  (Ahmad Fatoum, 1 file, -1/+1)
The "stm32_pwr_wakeup" IRQ is optional per the binding and the driver handles its absence gracefully. Request it with platform_get_irq_byname_optional, so its absence doesn't needlessly clutter the log. Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-25  net: ethernet: stmmac: demote warnings about missing optional clocks  (Ahmad Fatoum, 2 files, -2/+2)
The specification of an "eth-ck" and a "ptp_ref" clock is optional per the binding and the driver handles their absence gracefully. Demote the output to an info message accordingly. Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-25  Merge branch 'Add-ACPI-bindings-to-the-genet'  (David S. Miller, 4 files, -42/+119)
Jeremy Linton says:
====================
Add ACPI bindings to the genet

This patch series allows the BCM GENET, as used on the RPi4, to attach when booted in an ACPI environment. The DSDT entry to trigger this is seen below. Of note, the first patch adds a small extension to the mdio layer which allows drivers to find the mii_bus without firmware assistance. The fifth patch in the set retrieves the MAC address from the umac registers rather than carrying it directly in the DSDT. This of course requires the firmware to pre-program it, so we continue to fall back on a random one if it appears to be garbage.

v1 -> v2:
 - fail on missing phy-mode property
 - replace phy-mode internal property read string with device_get_phy_mode() equivalent
 - rework mac address detection logic so that it merges the acpi/DT case into device_get_mac_address() allowing _DSD mac address properties.
 - some commit messages justifying why phy_find_first() isn't the worst choice for this driver.

+    Device (ETH0)
+    {
+        Name (_HID, "BCM6E4E")
+        Name (_UID, 0)
+        Name (_CCA, 0x0)
+        Method (_STA)
+        {
+            Return (0xf)
+        }
+        Method (_CRS, 0x0, Serialized)
+        {
+            Name (RBUF, ResourceTemplate ()
+            {
+                Memory32Fixed (ReadWrite, 0xFd580000, 0x10000, )
+                Interrupt (ResourceConsumer, Level, ActiveHigh, Exclusive) { 0xBD }
+                Interrupt (ResourceConsumer, Level, ActiveHigh, Exclusive) { 0xBE }
+            })
+            Return (RBUF)
+        }
+        Name (_DSD, Package () {
+            ToUUID("daffd814-6eba-4d8c-8a91-bc9bbf4aa301"),
+            Package () {
+                Package () { "phy-mode", "rgmii-rxid" },
+            }
+        })
+    }
====================

Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-25  net: bcmgenet: reduce severity of missing clock warnings  (Jeremy Linton, 1 file, -3/+3)
If one types "failed to get enet clock" or similar into Google there are ~370k hits. The vast majority are people debugging problems unrelated to this adapter, or bragging about their rpi's. Further, the DT clock bindings here are optional. Given that it's not a fatal situation with common DT based systems, let's reduce the severity so people aren't seeing failure messages in everyday operation. Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Reviewed-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-25  net: bcmgenet: Fetch MAC address from the adapter  (Jeremy Linton, 1 file, -12/+27)
ARM/ACPI machines should utilize self describing hardware when possible. The MAC address on the BCMGENET can be read from the adapter if a full featured firmware has already programmed it. Let's try using the address already programmed, if it appears to be valid. It should be noted that while we move the macaddr logic below the clock and power logic in the driver, none of that code will ever be active in an ACPI environment as the device will be attached to the acpi power domain, and brought to full power with all clocks enabled immediately before the device probe routine is called. One side effect of the above tweak is that while it's now possible to read the MAC address via _DSD properties, it should be avoided. Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
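A rough sketch of that flow; the UMAC register names/offsets and the helper name are assumptions based on the existing driver, not a verbatim copy of the patch:

    /* Read the address the firmware programmed into the UMAC and keep it
     * only if it is valid; otherwise fall back to a random address.
     */
    static void bcmgenet_get_hw_addr(struct bcmgenet_priv *priv,
                                     struct net_device *dev)
    {
            u8 addr[ETH_ALEN];
            u32 reg;

            reg = bcmgenet_umac_readl(priv, UMAC_MAC0);     /* bytes 0-3 */
            addr[0] = reg >> 24;
            addr[1] = reg >> 16;
            addr[2] = reg >> 8;
            addr[3] = reg;
            reg = bcmgenet_umac_readl(priv, UMAC_MAC1);     /* bytes 4-5 */
            addr[4] = reg >> 8;
            addr[5] = reg;

            if (is_valid_ether_addr(addr))
                    ether_addr_copy(dev->dev_addr, addr);
            else
                    eth_hw_addr_random(dev);                /* registers held garbage */
    }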
2020-02-25  net: bcmgenet: Initial bcmgenet ACPI support  (Jeremy Linton, 1 file, -7/+14)
The rpi4 is capable of booting in ACPI mode with the latest edk2-platform commits. As such it would be helpful if the genet platform device were usable. To achieve this we add a new MODULE_DEVICE_TABLE, and convert a few dt specific methods to their generic device_ calls. Until the next patch, ACPI based machines will fall back on random MAC addresses. Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-25  net: bcmgenet: enable automatic phy discovery  (Jeremy Linton, 1 file, -6/+33)
The unimac mdio driver falls back to scanning the entire bus if it's given an appropriate mask. In ACPI mode we expect that the system is well behaved and conforms to recent versions of the specification. We then utilize phy_find_first() and phy_connect_direct() to find and attach to the discovered phy during net_device open. While it's apparently possible to build a genet based device with multiple phys on a single mdio bus, this works for current machines. Further, this driver makes a number of assumptions about the platform device, mac, mdio and phy all being 1:1. Lastly, it also avoids having to create references across the ACPI namespace hierarchy. Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
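Sketch of the attach path described above (the adjust-link callback name is taken from the existing driver and otherwise assumed):

    /* ACPI path: no phy handle in firmware, so scan the freshly
     * registered MDIO bus and attach to the first PHY found.
     */
    static int bcmgenet_acpi_phy_connect(struct net_device *dev,
                                         struct mii_bus *mdio_bus,
                                         phy_interface_t iface)
    {
            struct phy_device *phydev;

            phydev = phy_find_first(mdio_bus);
            if (!phydev)
                    return -ENODEV;

            return phy_connect_direct(dev, phydev, bcmgenet_mii_setup, iface);
    }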
2020-02-25  net: bcmgenet: refactor phy mode configuration  (Jeremy Linton, 1 file, -16/+26)
The DT phy mode handling is similar to what we want for ACPI, so let's factor it out of the OF path and change the of_ call to device_. Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-25  mdio_bus: Add generic mdio_find_bus()  (Jeremy Linton, 2 files, -0/+18)
It appears most ethernet drivers follow one of two main strategies for mdio bus/phy management. A monolithic model where the net driver itself creates, probes and uses the phy, and one where an external mdio/phy driver instantiates the mdio bus/phy and the net driver only attaches to a known phy. Usually in this latter model the phys are discovered via DT relationships or simply phy name/address hardcoding. This is a shame because modern well behaved mdio buses are self describing and can be probed. The mdio layer itself is fully capable of this, yet there isn't a clean way for a standalone net driver to attach and enumerate the discovered devices. This is because outside of of_mdio_find_bus() there isn't a straightforward way to acquire the mii_bus pointer. So, let's add an mdio_find_bus() which can return the mii_bus based only on its name. Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
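A hedged usage sketch of the new helper from a standalone net driver; the bus name string is an example only:

    /* Look the MDIO bus up purely by name, then enumerate its PHYs. */
    static struct phy_device *example_first_phy_on_bus(const char *bus_name)
    {
            struct mii_bus *bus = mdio_find_bus(bus_name);

            if (!bus)
                    return NULL;
            return phy_find_first(bus);     /* first PHY discovered on that bus */
    }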
2020-02-25  freescale: Replace zero-length array with flexible-array member  (Gustavo A. R. Silva, 3 files, -3/+3)
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-25  intel: Replace zero-length array with flexible-array member  (Gustavo A. R. Silva, 5 files, -8/+8)
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-25  netronome: Replace zero-length array with flexible-array member  (Gustavo A. R. Silva, 6 files, -12/+12)
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-25  toshiba: Replace zero-length array with flexible-array member  (Gustavo A. R. Silva, 4 files, -4/+4)
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-25  chelsio: Replace zero-length array with flexible-array member  (Gustavo A. R. Silva, 13 files, -17/+17)
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-25  Merge branch 'Remainder-for-DT-bindings-for-Felix-DSA-switch-on-LS1028A'  (David S. Miller, 3 files, -4/+118)
Vladimir Oltean says:
====================
Remainder for "DT bindings for Felix DSA switch on LS1028A"

This series is the remainder of patchset [0] which has been merged through Shawn Guo's devicetree tree. It contains changes to the PHY mode validation in the Felix driver ("gmii" to "internal") and the documentation for the DT bindings.

[0]: https://patchwork.ozlabs.org/cover/1242716/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-25  dt-bindings: net: dsa: ocelot: document the vsc9959 core  (Vladimir Oltean, 1 file, -0/+116)
This patch adds the required documentation for the embedded L2 switch inside the NXP LS1028A chip. I've submitted it in the legacy format instead of yaml schema, because DSA itself has not yet been converted to yaml, and this driver defines no custom bindings. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-25  net: dsa: felix: Use PHY_INTERFACE_MODE_INTERNAL instead of GMII  (Vladimir Oltean, 2 files, -4/+2)
phy-mode = "gmii" is confusing because it may mean that the port supports the 8-bit-wide parallel data interface pinout, which it doesn't. It may also be confusing because one of the "gmii" internal ports is actually overclocked to run at 2.5Gbps (even though, yes, as far as the switch MAC is concerned, it still thinks it's gigabit). So use the phy-mode = "internal" property to describe the internal ports inside the NXP LS1028A chip (the ones facing the ENETC). The change should be fine, because the device tree bindings document is yet to be introduced, and there are no stable DT blobs in use. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Michael Walle <michael@walle.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-25  bareudp: Fix uninitialized variable warnings.  (David S. Miller, 1 file, -0/+2)
drivers/net/bareudp.c: In function 'bareudp_xmit_skb':
drivers/net/bareudp.c:346:9: warning: 'err' may be used uninitialized in this function [-Wmaybe-uninitialized]
  346 |  return err;
      |         ^~~
drivers/net/bareudp.c: In function 'bareudp6_xmit_skb':
drivers/net/bareudp.c:407:9: warning: 'err' may be used uninitialized in this function [-Wmaybe-uninitialized]
  407 |  return err;

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-25  audit: always check the netlink payload length in audit_receive_msg()  (Paul Moore, 1 file, -19/+21)
This patch ensures that we always check the netlink payload length in audit_receive_msg() before we take any action on the payload itself. Cc: stable@vger.kernel.org Reported-by: syzbot+399c44bf1f43b8747403@syzkaller.appspotmail.com Reported-by: syzbot+e4b12d8d202701f08b6d@syzkaller.appspotmail.com Signed-off-by: Paul Moore <paul@paul-moore.com>
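The pattern being enforced, as a hedged sketch with illustrative names (not the actual kernel/audit.c code):

    struct example_status { u32 enabled; u32 pid; };      /* illustrative payload */

    /* Validate nlmsg_len() before dereferencing the payload. */
    static int example_receive(struct nlmsghdr *nlh)
    {
            struct example_status *s;

            if (nlmsg_len(nlh) < sizeof(*s))
                    return -EINVAL;         /* payload too short, reject early */

            s = nlmsg_data(nlh);
            /* safe to read *s from here on */
            return 0;
    }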
2020-02-25  Merge branch 'Bare-UDP-L3-Encapsulation-Module'  (David S. Miller, 11 files, -0/+1033)
Martin Varghese says:
====================
Bare UDP L3 Encapsulation Module

There are various L3 encapsulation standards using UDP being discussed to leverage the UDP based load balancing capability of different networks. MPLSoUDP (https://tools.ietf.org/html/rfc7510) is one among them. The Bareudp tunnel module provides a generic L3 encapsulation tunnelling support for tunnelling different L3 protocols like MPLS, IP, NSH etc. inside a UDP tunnel.

Special Handling
----------------
The bareudp device supports special handling for MPLS & IP as they can have multiple ethertypes. MPLS protocol can have ethertypes ETH_P_MPLS_UC (unicast) & ETH_P_MPLS_MC (multicast). IP protocol can have ethertypes ETH_P_IP (v4) & ETH_P_IPV6 (v6). This special handling can be enabled only for ethertypes ETH_P_IP & ETH_P_MPLS_UC with a flag called multiproto mode.

Usage
-----
1) Device creation & deletion

   a) ip link add dev bareudp0 type bareudp dstport 6635 ethertype 0x8847

      This creates a bareudp tunnel device which tunnels L3 traffic with ethertype 0x8847 (MPLS traffic). The destination port of the UDP header will be set to 6635. The device will listen on UDP port 6635 to receive traffic.

   b) ip link delete bareudp0

2) Device creation with multiple proto mode enabled

   There are two ways to create a bareudp device for MPLS & IP with multiproto mode enabled.

   a) ip link add dev bareudp0 type bareudp dstport 6635 ethertype 0x8847 multiproto

   b) ip link add dev bareudp0 type bareudp dstport 6635 ethertype mpls

3) Device Usage

   The bareudp device could be used along with OVS or flower filter in TC. The OVS or TC flower layer must set the tunnel information in SKB dst field before sending packet buffer to the bareudp device for transmission. On reception the bareudp device extracts and stores the tunnel information in SKB dst field before passing the packet buffer to the network stack.

Why not FOU?
------------
FOU by design does l4 encapsulation. It maps udp port to ipproto (IP protocol number for l4 protocol). Bareudp achieves a generic l3 encapsulation. It maps udp port to l3 ethertype.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-25  net: Special handling for IP & MPLS.  (Martin Varghese, 4 files, -4/+85)
Special handling is needed in the bareudp module for IP & MPLS as they support more than one ethertype.

MPLS has 2 ethertypes: 0x8847 for MPLS unicast and 0x8848 for MPLS multicast. While decapsulating an MPLS packet from a UDP packet the tunnel destination IP address is checked to determine the ethertype. The ethertype of the packet will be set to 0x8848 if the tunnel destination IP address is a multicast IP address, and to 0x8847 if the tunnel destination IP address is a unicast IP address.

IP has 2 ethertypes: 0x0800 for IPv4 and 0x86dd for IPv6. The version field of the tunnelled IP header will be checked to determine the ethertype.

This special handling to tunnel additional ethertypes is disabled by default and can be enabled using a flag called multiproto. This flag can be used only with ethertypes 0x8847 and 0x0800.

Signed-off-by: Martin Varghese <martin.varghese@nokia.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
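A minimal sketch of the ethertype selection described above for the IPv4 case; the helper name is illustrative, not the bareudp driver's actual function:

    /* Pick the MPLS ethertype from the tunnel destination address. */
    static __be16 mpls_proto_from_daddr(__be32 daddr)
    {
            if (ipv4_is_multicast(daddr))
                    return htons(ETH_P_MPLS_MC);    /* 0x8848 */
            return htons(ETH_P_MPLS_UC);            /* 0x8847 */
    }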