summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2020-02-27net: cisco: Replace zero-length array with flexible-array memberGustavo A. R. Silva2-5/+5
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] Lastly, fix the following checkpatch warning: CHECK: Prefer kernel type 'u32' over 'u_int32_t' #61: FILE: drivers/net/ethernet/cisco/enic/vnic_devcmd.h:653: + u_int32_t val[]; This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-27net: marvell: Replace zero-length array with flexible-array memberGustavo A. R. Silva2-2/+2
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-27net: hns: Replace zero-length array with flexible-array memberGustavo A. R. Silva2-2/+2
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-27sfc: Replace zero-length array with flexible-array memberGustavo A. R. Silva2-2/+2
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Acked-by: Martin Habets <mhabets@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-27qlogic: Replace zero-length array with flexible-array memberGustavo A. R. Silva2-5/+5
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was detected with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-27net: dsa: bcm_sf2: Forcibly configure IMP port for 1Gb/secFlorian Fainelli1-2/+1
We are still experiencing some packet loss with the existing advanced congestion buffering (ACB) settings with the IMP port configured for 2Gb/sec, so revert to conservative link speeds that do not produce packet loss until this is resolved. Fixes: 8f1880cbe8d0 ("net: dsa: bcm_sf2: Configure IMP port for 2Gb/sec") Fixes: de34d7084edd ("net: dsa: bcm_sf2: Only 7278 supports 2Gb/sec IMP port") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-27Revert "net: dsa: bcm_sf2: Also configure Port 5 for 2Gb/sec on 7278"Florian Fainelli2-4/+0
This reverts commit 7458bd540fa0a90220b9e8c349d910d9dde9caf8 ("net: dsa: bcm_sf2: Also configure Port 5 for 2Gb/sec on 7278") as it causes advanced congestion buffering issues with 7278 switch devices when using their internal Giabit PHY. While this is being debugged, continue with conservative defaults that work and do not cause packet loss. Fixes: 7458bd540fa0 ("net: dsa: bcm_sf2: Also configure Port 5 for 2Gb/sec on 7278") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller6-228/+529
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes: 1) Perform garbage collection from workqueue to fix rcu detected stall in ipset hash set types, from Jozsef Kadlecsik. 2) Fix the forceadd evaluation path, also from Jozsef. 3) Fix nft_set_pipapo selftest, from Stefano Brivio. 4) Crash when add-flush-add element in pipapo set, also from Stefano. Add test to cover this crash. 5) Remove sysctl entry under mutex in hashlimit, from Cong Wang. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-27Merge tag 'tag-chrome-platform-fixes-for-v5.6-rc4' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux Pull chrome platform fix from Benson Leung: "Fix a build warning" * tag 'tag-chrome-platform-fixes-for-v5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux: platform/chrome: wilco_ec: Include asm/unaligned instead of linux/ path
2020-02-27bnxt_en: add newline to netdev_*() format stringsJonathan Lemon5-38/+38
Add missing newlines to netdev_* format strings so the lines aren't buffered by the printk subsystem. Nitpicked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Acked-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-27netfilter: xt_hashlimit: unregister proc file before releasing mutexCong Wang1-10/+6
Before releasing the global mutex, we only unlink the hashtable from the hash list, its proc file is still not unregistered at this point. So syzbot could trigger a race condition where a parallel htable_create() could register the same file immediately after the mutex is released. Move htable_remove_proc_entry() back to mutex protection to fix this. And, fold htable_destroy() into htable_put() to make the code slightly easier to understand. Reported-and-tested-by: syzbot+d195fd3b9a364ddd6731@syzkaller.appspotmail.com Fixes: c4a3922d2d20 ("netfilter: xt_hashlimit: reduce hashlimit_mutex scope for htable_put()") Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-02-26ethtool: limit bitset sizeMichal Kubecek2-1/+4
Syzbot reported that ethnl_compact_sanity_checks() can be tricked into reading past the end of ETHTOOL_A_BITSET_VALUE and ETHTOOL_A_BITSET_MASK attributes and even the message by passing a value between (u32)(-31) and (u32)(-1) as ETHTOOL_A_BITSET_SIZE. The problem is that DIV_ROUND_UP(attr_nbits, 32) is 0 for such values so that zero length ETHTOOL_A_BITSET_VALUE will pass the length check but ethnl_bitmap32_not_zero() check would try to access up to 512 MB of attribute "payload". Prevent this overflow byt limiting the bitset size. Technically, compact bitset format would allow bitset sizes up to almost 2^18 (so that the nest size does not exceed U16_MAX) but bitsets used by ethtool are much shorter. S16_MAX, the largest value which can be directly used as an upper limit in policy, should be a reasonable compromise. Fixes: 10b518d4e6dd ("ethtool: netlink bitset handling") Reported-by: syzbot+7fd4ed5b4234ab1fdccd@syzkaller.appspotmail.com Reported-by: syzbot+709b7a64d57978247e44@syzkaller.appspotmail.com Reported-by: syzbot+983cb8fb2d17a7af549d@syzkaller.appspotmail.com Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-26net: Fix Tx hash bound checkingAmritha Nambiar1-0/+2
Fixes the lower and upper bounds when there are multiple TCs and traffic is on the the same TC on the same device. The lower bound is represented by 'qoffset' and the upper limit for hash value is 'qcount + qoffset'. This gives a clean Rx to Tx queue mapping when there are multiple TCs, as the queue indices for upper TCs will be offset by 'qoffset'. v2: Fixed commit description based on comments. Fixes: 1b837d489e06 ("net: Revoke export for __skb_tx_hash, update it to just be static skb_tx_hash") Fixes: eadec877ce9c ("net: Add support for subordinate traffic classes to netdev_pick_tx") Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com> Reviewed-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-26Merge tag 'trace-v5.6-rc2' of ↵Linus Torvalds15-92/+272
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing and bootconfig updates: "Fixes and changes to bootconfig before it goes live in a release. Change in API of bootconfig (before it comes live in a release): - Have a magic value "BOOTCONFIG" in initrd to know a bootconfig exists - Set CONFIG_BOOT_CONFIG to 'n' by default - Show error if "bootconfig" on cmdline but not compiled in - Prevent redefining the same value - Have a way to append values - Added a SELECT BLK_DEV_INITRD to fix a build failure Synthetic event fixes: - Switch to raw_smp_processor_id() for recording CPU value in preempt section. (No care for what the value actually is) - Fix samples always recording u64 values - Fix endianess - Check number of values matches number of fields - Fix a printing bug Fix of trace_printk() breaking postponed start up tests Make a function static that is only used in a single file" * tag 'trace-v5.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: bootconfig: Fix CONFIG_BOOTTIME_TRACING dependency issue bootconfig: Add append value operator support bootconfig: Prohibit re-defining value on same key bootconfig: Print array as multiple commands for legacy command line bootconfig: Reject subkey and value on same parent key tools/bootconfig: Remove unneeded error message silencer bootconfig: Add bootconfig magic word for indicating bootconfig explicitly bootconfig: Set CONFIG_BOOT_CONFIG=n by default tracing: Clear trace_state when starting trace bootconfig: Mark boot_config_checksum() static tracing: Disable trace_printk() on post poned tests tracing: Have synthetic event test use raw_smp_processor_id() tracing: Fix number printing bug in print_synth_event() tracing: Check that number of vals matches number of synth event fields tracing: Make synth_event trace functions endian-correct tracing: Make sure synth_event_trace() example always uses u64
2020-02-26Merge tag 'linux-kselftest-kunit-5.6-rc4' of ↵Linus Torvalds3-12/+29
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull Kunit fixes from Shuah Khan: "This Kselftest kunit update consists of fixes to documentation and the run-time tool from Brendan Higgins and Heidi Fahim" * tag 'linux-kselftest-kunit-5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: kunit: run kunit_tool from any directory kunit: test: Improve error messages for kunit_tool when kunitconfig is invalid Documentation: kunit: fixed sphinx error in code block
2020-02-26Merge tag 'linux-kselftest-5.6-rc4' of ↵Linus Torvalds6-2/+14
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull Kselftest fixes from Shuah Khan: - fixes to TIMEOUT failures and out-of-tree compilation compilation errors from Michael Ellerman. - declutter git status fix from Christophe Leroy * tag 'linux-kselftest-5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: selftests/rseq: Fix out-of-tree compilation selftests: Install settings files to fix TIMEOUT failures selftest/lkdtm: Don't pollute 'git status'
2020-02-26Revert "KVM: x86: enable -Werror"Christoph Hellwig1-1/+0
This reverts commit ead68df94d248c80fdbae220ae5425eb5af2e753. Using the -Werror flag breaks the build for me due to mostly harmless KASAN or similar warnings: arch/x86/kvm/x86.c: In function ‘kvm_timer_init’: arch/x86/kvm/x86.c:7209:1: error: the frame size of 1112 bytes is larger than 1024 bytes [-Werror=frame-larger-than=] Feel free to add a CONFIG_WERROR if you care strong enough, but don't break peoples builds for absolutely no good reason. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-02-26signal: avoid double atomic counter increments for user accountingLinus Torvalds1-9/+14
When queueing a signal, we increment both the users count of pending signals (for RLIMIT_SIGPENDING tracking) and we increment the refcount of the user struct itself (because we keep a reference to the user in the signal structure in order to correctly account for it when freeing). That turns out to be fairly expensive, because both of them are atomic updates, and particularly under extreme signal handling pressure on big machines, you can get a lot of cache contention on the user struct. That can then cause horrid cacheline ping-pong when you do these multiple accesses. So change the reference counting to only pin the user for the _first_ pending signal, and to unpin it when the last pending signal is dequeued. That means that when a user sees a lot of concurrent signal queuing - which is the only situation when this matters - the only atomic access needed is generally the 'sigpending' count update. This was noticed because of a particularly odd timing artifact on a dual-socket 96C/192T Cascade Lake platform: when you get into bad contention, on that machine for some reason seems to be much worse when the contention happens in the upper 32-byte half of the cacheline. As a result, the kernel test robot will-it-scale 'signal1' benchmark had an odd performance regression simply due to random alignment of the 'struct user_struct' (and pointed to a completely unrelated and apparently nonsensical commit for the regression). Avoiding the double increments (and decrements on the dequeueing side, of course) makes for much less contention and hugely improved performance on that will-it-scale microbenchmark. Quoting Feng Tang: "It makes a big difference, that the performance score is tripled! bump from original 17000 to 54000. Also the gap between 5.0-rc6 and 5.0-rc6+Jiri's patch is reduced to around 2%" [ The "2% gap" is the odd cacheline placement difference on that platform: under the extreme contention case, the effect of which half of the cacheline was hot was 5%, so with the reduced contention the odd timing artifact is reduced too ] It does help in the non-contended case too, but is not nearly as noticeable. Reported-and-tested-by: Feng Tang <feng.tang@intel.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Huang, Ying <ying.huang@intel.com> Cc: Philip Li <philip.li@intel.com> Cc: Andi Kleen <andi.kleen@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-02-26Merge branch 'bpf-bpftool-probes'Daniel Borkmann7-123/+373
Michal Rostecki says: ==================== Feature probes in bpftool related to bpf_probe_write_user and bpf_trace_printk helpers emit dmesg warnings which might be confusing for people running bpftool on production environments. This patch series addresses that by filtering them out by default and introducing the new positional argument "full" which enables all available probes. The main motivation behind those changes is ability the fact that some probes (for example those related to "trace" or "write_user" helpers) emit dmesg messages which might be confusing for people who are running on production environments. For details see the Cilium issue[0]. v1 -> v2: - Do not expose regex filters to users, keep filtering logic internal, expose only the "full" option for including probes which emit dmesg warnings. v2 -> v3: - Do not use regex for filtering out probes, use function IDs directly. - Fix bash completion - in v2 only "prefix" was proposed after "macros", "dev" and "kernel" were not. - Rephrase the man page paragraph, highlight helper function names. - Remove tests which parse the plain output of bpftool (except the header/macros test), focus on testing JSON output instead. - Add test which compares the output with and without "full" option. v3 -> v4: - Use enum to check for helper functions. - Make selftests compatible with older versions of Python 3.x than 3.7. [0] https://github.com/cilium/cilium/issues/10048 ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2020-02-26selftests/bpf: Add test for "bpftool feature" commandMichal Rostecki4-2/+189
Add Python module with tests for "bpftool feature" command, which mainly checks whether the "full" option is working properly. Signed-off-by: Michal Rostecki <mrostecki@opensuse.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20200226165941.6379-6-mrostecki@opensuse.org
2020-02-26bpftool: Update bash completion for "bpftool feature" commandMichal Rostecki1-1/+2
Update bash completion for "bpftool feature" command with the new argument: "full". Signed-off-by: Michal Rostecki <mrostecki@opensuse.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20200226165941.6379-5-mrostecki@opensuse.org
2020-02-26bpftool: Update documentation of "bpftool feature" commandMichal Rostecki1-9/+10
Update documentation of "bpftool feature" command with information about new arguments: "full". Signed-off-by: Michal Rostecki <mrostecki@opensuse.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20200226165941.6379-4-mrostecki@opensuse.org
2020-02-26bpftool: Make probes which emit dmesg warnings optionalMichal Rostecki1-21/+49
Probes related to bpf_probe_write_user and bpf_trace_printk helpers emit dmesg warnings which might be confusing for people running bpftool on production environments. This change filters them out by default and introduces the new positional argument "full" which enables all available probes. Signed-off-by: Michal Rostecki <mrostecki@opensuse.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20200226165941.6379-3-mrostecki@opensuse.org
2020-02-26bpftool: Move out sections to separate functionsMichal Rostecki1-93/+126
Remove all calls of print_end_then_start_section function and for loops out from the do_probe function. Instead, provide separate functions for each section (like i.e. section_helpers) which are called in do_probe. This change is motivated by better readability. Signed-off-by: Michal Rostecki <mrostecki@opensuse.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20200226165941.6379-2-mrostecki@opensuse.org
2020-02-26kbuild: add dt_binding_check to PHONY in a correct placeMasahiro Yamada1-1/+2
The dt_binding_check is added to PHONY, but it is invisible when $(dtstree) is empty. So, it is not specified as phony for ARCH=x86 etc. Add it to PHONY outside the ifneq ... endif block. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Acked-by: Rob Herring <robh@kernel.org>
2020-02-26kbuild: add dtbs_check to PHONYMasahiro Yamada1-1/+1
The dtbs_check should be a phony target, but currently it is not specified so. 'make dtbs_check' works even if a file named 'dtbs_check' exists because it depends on another phony target, scripts_dtc, but we should not rely on it. Add dtbs_check to PHONY. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Acked-by: Rob Herring <robh@kernel.org>
2020-02-26kbuild: remove unneeded semicolon at the end of cmd_dtb_checkMasahiro Yamada1-1/+1
This trailing semicolon is unneeded. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Acked-by: Rob Herring <robh@kernel.org>
2020-02-26kbuild: fix DT binding schema rule to detect command line changesMasahiro Yamada1-2/+2
This if_change_rule is not working properly; it cannot detect any command line change. The reason is because cmd-check in scripts/Kbuild.include compares $(cmd_$@) and $(cmd_$1), but cmd_dtc_dt_yaml does not exist here. For if_change_rule to work properly, the stem part of cmd_* and rule_* must match. Because this cmd_and_fixdep invokes cmd_dtc, this rule must be named rule_dtc. Fixes: 4f0e3a57d6eb ("kbuild: Add support for DT binding schema checks") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Acked-by: Rob Herring <robh@kernel.org>
2020-02-26kbuild: remove wrong documentation about mandatory-yMasahiro Yamada1-3/+0
This sentence does not make sense in the section about mandatory-y. This seems to be a copy-paste mistake of commit fcc8487d477a ("uapi: export all headers under uapi directories"). The correct description would be "The convention is to list one mandatory-y per line ...". I just removed it instead of fixing it. If such information is needed, it could be commented in include/asm-generic/Kbuild and include/uapi/asm-generic/Kbuild. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2020-02-26kbuild: add comment for V=2 modeRandy Dunlap1-0/+1
Complete the comments for valid values of KBUILD_VERBOSE, specifically for KBUILD_VERBOSE=2. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2020-02-26iavf: use tc_cls_can_offload_and_chain0() instead of chain checkJiri Pirko1-3/+5
Looks like the iavf code actually experienced a race condition, when a developer took code before the check for chain 0 was put to helper. So use tc_cls_can_offload_and_chain0() helper instead of direct check and move the check to _cb() so this is similar to i40e code. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-26bpftool: Support struct_ops, tracing, ext prog typesAndrey Ignatov4-4/+9
Add support for prog types that were added to kernel but not present in bpftool yet: struct_ops, tracing, ext prog types and corresponding section names. Before: # bpftool p l ... 184: type 26 name test_subprog3 tag dda135a7dc0daf54 gpl loaded_at 2020-02-25T13:28:33-0800 uid 0 xlated 112B jited 103B memlock 4096B map_ids 136 btf_id 85 185: type 28 name new_get_skb_len tag d2de5b87d8e5dc49 gpl loaded_at 2020-02-25T13:28:33-0800 uid 0 xlated 72B jited 69B memlock 4096B map_ids 136 btf_id 85 After: # bpftool p l ... 184: tracing name test_subprog3 tag dda135a7dc0daf54 gpl loaded_at 2020-02-25T13:28:33-0800 uid 0 xlated 112B jited 103B memlock 4096B map_ids 136 btf_id 85 185: ext name new_get_skb_len tag d2de5b87d8e5dc49 gpl loaded_at 2020-02-25T13:28:33-0800 uid 0 xlated 72B jited 69B memlock 4096B map_ids 136 btf_id 85 Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Quentin Monnet <quentin@isovalent.com> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20200225223441.689109-1-rdna@fb.com
2020-02-26scripts/bpf: Switch to more portable python3 shebangScott Branden1-1/+1
Change "/usr/bin/python3" to "/usr/bin/env python3" for more portable solution in bpf_helpers_doc.py. Signed-off-by: Scott Branden <scott.branden@broadcom.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20200225205426.6975-1-scott.branden@broadcom.com
2020-02-26selftests: nft_concat_range: Add test for reported add/flush/add issueStefano Brivio1-4/+39
Add a specific test for the crash reported by Phil Sutter and addressed in the previous patch. The test cases that, in my intention, should have covered these cases, that is, the ones from the 'concurrency' section, don't run these sequences tightly enough and spectacularly failed to catch this. While at it, define a convenient way to add these kind of tests, by adding a "reported issues" test section. It's more convenient, for this particular test, to execute the set setup in its own function. However, future test cases like this one might need to call setup functions, and will typically need no tools other than nft, so allow for this in check_tools(). The original form of the reproducer used here was provided by Phil. Reported-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-02-26nft_set_pipapo: Actually fetch key data in nft_pipapo_remove()Stefano Brivio1-2/+4
Phil reports that adding elements, flushing and re-adding them right away: nft add table t '{ set s { type ipv4_addr . inet_service; flags interval; }; }' nft add element t s '{ 10.0.0.1 . 22-25, 10.0.0.1 . 10-20 }' nft flush set t s nft add element t s '{ 10.0.0.1 . 10-20, 10.0.0.1 . 22-25 }' triggers, almost reliably, a crash like this one: [ 71.319848] general protection fault, probably for non-canonical address 0x6f6b6e696c2e756e: 0000 [#1] PREEMPT SMP PTI [ 71.321540] CPU: 3 PID: 1201 Comm: kworker/3:2 Not tainted 5.6.0-rc1-00377-g2bb07f4e1d861 #192 [ 71.322746] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190711_202441-buildvm-armv7-10.arm.fedoraproject.org-2.fc31 04/01/2014 [ 71.324430] Workqueue: events nf_tables_trans_destroy_work [nf_tables] [ 71.325387] RIP: 0010:nft_set_elem_destroy+0xa5/0x110 [nf_tables] [ 71.326164] Code: 89 d4 84 c0 74 0e 8b 77 44 0f b6 f8 48 01 df e8 41 ff ff ff 45 84 e4 74 36 44 0f b6 63 08 45 84 e4 74 2c 49 01 dc 49 8b 04 24 <48> 8b 40 38 48 85 c0 74 4f 48 89 e7 4c 8b [ 71.328423] RSP: 0018:ffffc9000226fd90 EFLAGS: 00010282 [ 71.329225] RAX: 6f6b6e696c2e756e RBX: ffff88813ab79f60 RCX: ffff88813931b5a0 [ 71.330365] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff88813ab79f9a [ 71.331473] RBP: ffff88813ab79f60 R08: 0000000000000008 R09: 0000000000000000 [ 71.332627] R10: 000000000000021c R11: 0000000000000000 R12: ffff88813ab79fc2 [ 71.333615] R13: ffff88813b3adf50 R14: dead000000000100 R15: ffff88813931b8a0 [ 71.334596] FS: 0000000000000000(0000) GS:ffff88813bd80000(0000) knlGS:0000000000000000 [ 71.335780] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 71.336577] CR2: 000055ac683710f0 CR3: 000000013a222003 CR4: 0000000000360ee0 [ 71.337533] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 71.338557] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 71.339718] Call Trace: [ 71.340093] nft_pipapo_destroy+0x7a/0x170 [nf_tables_set] [ 71.340973] nft_set_destroy+0x20/0x50 [nf_tables] [ 71.341879] nf_tables_trans_destroy_work+0x246/0x260 [nf_tables] [ 71.342916] process_one_work+0x1d5/0x3c0 [ 71.343601] worker_thread+0x4a/0x3c0 [ 71.344229] kthread+0xfb/0x130 [ 71.344780] ? process_one_work+0x3c0/0x3c0 [ 71.345477] ? kthread_park+0x90/0x90 [ 71.346129] ret_from_fork+0x35/0x40 [ 71.346748] Modules linked in: nf_tables_set nf_tables nfnetlink 8021q [last unloaded: nfnetlink] [ 71.348153] ---[ end trace 2eaa8149ca759bcc ]--- [ 71.349066] RIP: 0010:nft_set_elem_destroy+0xa5/0x110 [nf_tables] [ 71.350016] Code: 89 d4 84 c0 74 0e 8b 77 44 0f b6 f8 48 01 df e8 41 ff ff ff 45 84 e4 74 36 44 0f b6 63 08 45 84 e4 74 2c 49 01 dc 49 8b 04 24 <48> 8b 40 38 48 85 c0 74 4f 48 89 e7 4c 8b [ 71.350017] RSP: 0018:ffffc9000226fd90 EFLAGS: 00010282 [ 71.350019] RAX: 6f6b6e696c2e756e RBX: ffff88813ab79f60 RCX: ffff88813931b5a0 [ 71.350019] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff88813ab79f9a [ 71.350020] RBP: ffff88813ab79f60 R08: 0000000000000008 R09: 0000000000000000 [ 71.350021] R10: 000000000000021c R11: 0000000000000000 R12: ffff88813ab79fc2 [ 71.350022] R13: ffff88813b3adf50 R14: dead000000000100 R15: ffff88813931b8a0 [ 71.350025] FS: 0000000000000000(0000) GS:ffff88813bd80000(0000) knlGS:0000000000000000 [ 71.350026] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 71.350027] CR2: 000055ac683710f0 CR3: 000000013a222003 CR4: 0000000000360ee0 [ 71.350028] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 71.350028] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 71.350030] Kernel panic - not syncing: Fatal exception [ 71.350412] Kernel Offset: disabled [ 71.365922] ---[ end Kernel panic - not syncing: Fatal exception ]--- which is caused by dangling elements that have been deactivated, but never removed. On a flush operation, nft_pipapo_walk() walks through all the elements in the mapping table, which are then deactivated by nft_flush_set(), one by one, and added to the commit list for removal. Element data is then freed. On transaction commit, nft_pipapo_remove() is called, and failed to remove these elements, leading to the stale references in the mapping. The first symptom of this, revealed by KASan, is a one-byte use-after-free in subsequent calls to nft_pipapo_walk(), which is usually not enough to trigger a panic. When stale elements are used more heavily, though, such as double-free via nft_pipapo_destroy() as in Phil's case, the problem becomes more noticeable. The issue comes from that fact that, on a flush operation, nft_pipapo_remove() won't get the actual key data via elem->key, elements to be deleted upon commit won't be found by the lookup via pipapo_get(), and removal will be skipped. Key data should be fetched via nft_set_ext_key(), instead. Reported-by: Phil Sutter <phil@nwl.cc> Fixes: 3c4287f62044 ("nf_tables: Add set type for arbitrary concatenation of ranges") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-02-26Merge branch 'master' of git://blackhole.kfki.hu/nfPablo Neira Ayuso3-206/+474
Jozsef Kadlecsik says: ==================== ipset patches for nf The first one is larger than usual, but the issue could not be solved simpler. Also, it's a resend of the patch I submitted a few days ago, with a one line fix on top of that: the size of the comment extensions was not taken into account at reporting the full size of the set. - Fix "INFO: rcu detected stall in hash_xxx" reports of syzbot by introducing region locking and using workqueue instead of timer based gc of timed out entries in hash types of sets in ipset. - Fix the forceadd evaluation path - the bug was also uncovered by the syzbot. ==================== Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-02-26net/mlx5: sparse: warning: Using plain integer as NULL pointerSaeed Mahameed1-1/+1
Return NULL instead of 0. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
2020-02-26net/mlx5: sparse: warning: incorrect type in assignmentSaeed Mahameed1-1/+1
drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c:191:13: sparse: warning: incorrect type in assignment (different base types) Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
2020-02-26net/mlx5: Fix header guard in rsc_dump.hNathan Chancellor1-1/+1
Clang warns: In file included from ../drivers/net/ethernet/mellanox/mlx5/core/main.c:73: ../drivers/net/ethernet/mellanox/mlx5/core/diag/rsc_dump.h:4:9: warning: '__MLX5_RSC_DUMP_H' is used as a header guard here, followed by #define of a different macro [-Wheader-guard] #ifndef __MLX5_RSC_DUMP_H ^~~~~~~~~~~~~~~~~ ../drivers/net/ethernet/mellanox/mlx5/core/diag/rsc_dump.h:5:9: note: '__MLX5_RSC_DUMP__H' is defined here; did you mean '__MLX5_RSC_DUMP_H'? #define __MLX5_RSC_DUMP__H ^~~~~~~~~~~~~~~~~~ __MLX5_RSC_DUMP_H 1 warning generated. Make them match to get the intended behavior and remove the warning. Fixes: 12206b17235a ("net/mlx5: Add support for resource dump") Link: https://github.com/ClangBuiltLinux/linux/issues/897 Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-26Documentation: fix vxlan typo in mlx5.rstHans Wippel1-1/+1
Fix a vxlan typo in the mlx5 driver documentation. Signed-off-by: Hans Wippel <ndev@hwipl.net> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-26net/mlx5e: RX, Use indirect calls wrapper for handling compressed completionsTariq Toukan1-2/+4
We can avoid an indirect call per compressed completion wrapping the completion handling call with the appropriate helper. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-26net/mlx5e: RX, Use indirect calls wrapper for posting descriptorsTariq Toukan1-2/+9
We can avoid an indirect call per NAPI cycle wrapping the RX descriptors posting call with the appropriate helper. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-26net/mlx5e: Change inline mode correctly when changing trust stateMaxim Mikityanskiy1-22/+33
The current steps that are performed when the trust state changes, if the channels are active: 1. The trust state is changed in hardware. 2. The new inline mode is calculated. 3. If the new inline mode is different, the channels are recreated using the new inline mode. This approach has some issues: 1. There is a time gap between changing trust state in hardware and starting sending enough inline headers (the latter happens after recreation of channels). It leads to failed transmissions and error CQEs. 2. If the new channels fail to open, we'll be left with the old ones, but the hardware will be configured for the new trust state, so the interval when we can see TX errors never ends. This patch fixes the issues above by moving the trust state change into the preactivate hook that runs during the recreation of the channels when no channels are active, so it eliminates the gap of partially applied configuration. If the inline mode doesn't change with the change of the trust state, the channels won't be recreated, just like before this patch. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-26net/mlx5e: Add context to the preactivate hookMaxim Mikityanskiy6-27/+45
Sometimes the preactivate hook of mlx5e_safe_switch_channels needs more parameters than just struct mlx5e_priv *. For such cases, a new parameter (void *context) is added to preactivate hooks. Some of the existing normal functions are currently used as preactivate callbacks. To avoid adding an extra unused parameter, they are wrapped in an automatic way using the MLX5E_DEFINE_PREACTIVATE_WRAPPER_CTX macro. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-26net/mlx5e: Allow mlx5e_switch_priv_channels to fail and recoverMaxim Mikityanskiy1-7/+27
Currently mlx5e_switch_priv_channels expects that the preactivate hook doesn't fail, however, it can fail, because it may set hardware parameters. This commit addresses this issue and provides a way to recover from failures of the preactivate hook: the old channels are not closed until the point where nothing can fail anymore, so in case preactivate fails, the driver can roll back the old channels and activate them again. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-26net/mlx5e: Remove unneeded netif_set_real_num_tx_queuesMaxim Mikityanskiy1-6/+0
The number of queues is now updated by mlx5e_update_netdev_queues in a centralized way, when no channels are active. Remove an extra occurrence of netif_set_real_num_tx_queues to prepare it for the next commit. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-26net/mlx5e: Fix configuration of XPS cpumasks and netdev queues in corner casesMaxim Mikityanskiy2-41/+65
Currently, mlx5e notifies the kernel about the number of queues and sets the default XPS cpumasks when channels are activated. This implementation has several corner cases, in which the kernel may not be updated on time, or XPS cpumasks may be reset when not directly touched by the user. This commit fixes these corner cases to match the following expected behavior: 1. The number of queues always corresponds to the number of channels configured. 2. XPS cpumasks are set to driver's defaults on netdev attach. 3. XPS cpumasks set by user are not reset, unless the number of channels changes. If the number of channels changes, they are reset to driver's defaults. (In general case, when the number of channels increases or decreases, it's not possible to guess how to convert the current XPS cpumasks to work with the new number of channels, so we let the user reconfigure it if they change the number of channels.) XPS cpumasks are no longer stored per channel. Only one temporary cpumask is used. The old stored cpumasks didn't reflect the user's changes and were not used after applying them. A scratchpad area is added to struct mlx5e_priv. As cpumask_var_t requires allocation, and the preactivate hook can't fail, we need to preallocate the temporary cpumask in advance. It's stored in the scratchpad. Fixes: 149e566fef81 ("net/mlx5e: Expand XPS cpumask to cover all online cpus") Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-26net/mlx5e: Use preactivate hook to set the indirection tableMaxim Mikityanskiy3-10/+17
mlx5e_ethtool_set_channels updates the indirection table before switching to the new channels. If the switch fails, the indirection table is new, but the channels are old, which is wrong. Fix it by using the preactivate hook of mlx5e_safe_switch_channels to update the indirection table at the stage when nothing can fail anymore. As the code that updates the indirection table is now encapsulated into a new function, use that function in the attach flow when the driver has to reduce the number of channels, and prepare the code for the next commit. Fixes: 85082dba0a ("net/mlx5e: Correctly handle RSS indirection table when changing number of channels") Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-26net/mlx5e: Rename hw_modify to preactivateMaxim Mikityanskiy2-9/+11
mlx5e_safe_switch_channels accepts a callback to be called before activating new channels. It is intended to configure some hardware parameters in cases where channels are recreated because some configuration has changed. Recently, this callback has started being used to update the driver's internal MLX5E_STATE_XDP_OPEN flag, and the following patches also intend to use this callback for software preparations. This patch renames the hw_modify callback to preactivate, so that the name fits better. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-26net/mlx5e: Encapsulate updating netdev queues into a functionMaxim Mikityanskiy1-7/+12
As a preparation for one of the following commits, create a function to encapsulate the code that notifies the kernel about the new amount of RX and TX queues. The code will be called multiple times in the next commit. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>